Open arnaudstiegler opened 4 months ago
Hi,
Thanks for your suggestion. We will consider it for a future update.
Alternatively, you can create a new pipeline or use one of the other predefined pipelines. There are 11 predefined pipelines, named pipeline_archetype1 through pipeline_archetype11:
https://github.com/sparkfish/augraphy/blob/dev/augraphy/default/pipeline.py
Thanks for the answer! I didn't know about the predefined pipelines, not sure whether I missed them in the documentation. Are those just "random" pipelines or is there a specific use case / logic for each one?
Each pipeline is meant to reproduce one specific kind of real-life dirty-document effect. The output is meant to be consistent, so you will not see much variation within a single archetype pipeline.
Hi, I use Augraphy extensively but I've noticed that:
It'd be great to provide an option like "mild/strong" for the default pipeline, to give some control over it without needing to deep-dive into the internals of the package.
For instance, this doc is almost unreadable. Training models on unreadable docs can lead to really damaging behaviors, such as completely hallucinating answers on docs that the model can't read.
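To make the "mild/strong" idea concrete, here is a minimal sketch of what such a knob could look like. It is not an existing Augraphy option: `scale_probabilities`, `STRENGTH_FACTORS`, and the stand-in `FakeAugmentation` class are all hypothetical names, and the sketch assumes only that each augmentation object exposes a mutable probability attribute `p`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeAugmentation:
    """Hypothetical stand-in for an augmentation that fires with probability `p`."""
    name: str
    p: float


# Hypothetical presets: multiply every augmentation's probability by this factor.
STRENGTH_FACTORS = {"mild": 0.3, "default": 1.0, "strong": 1.5}


def scale_probabilities(augmentations: List[FakeAugmentation],
                        strength: str = "default") -> List[FakeAugmentation]:
    """Scale each augmentation's `p` in place, clamped to the range [0, 1]."""
    factor = STRENGTH_FACTORS[strength]
    for aug in augmentations:
        aug.p = min(1.0, max(0.0, aug.p * factor))
    return augmentations


if __name__ == "__main__":
    phase = [FakeAugmentation("blur", 0.5), FakeAugmentation("ink_bleed", 0.8)]
    for aug in scale_probabilities(phase, "mild"):
        print(aug.name, aug.p)
```

Scaling probabilities only changes how often each effect is applied, not how intense it is when it fires, so a real "mild" preset would probably also need to narrow the parameter ranges of the individual augmentations.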