Open rahulbot opened 6 months ago
(started testing work on the `feature-favor-precision` branch)
@pgulley can you queue this up to re-assess with the test code, in light of the comment at https://github.com/adbar/trafilatura/issues/584#issuecomment-2233846387
After some digging on https://github.com/mediacloud/story-indexer/issues/278 it looks like tweaking our Trafilatura integration to use `favor_precision=True` could help. In the sample code I provided, run on a few test cases from our researchers, it helped in 3 of 4 cases. This needs more vetting to gauge the impact before we consider rolling out the change.

Test case (change the `favor_precision` variable to see results):
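For anyone vetting this further, a side-by-side comparison could be sketched roughly as below. This is not the test code from this issue, just a minimal illustration. The names `compare_precision`, `summarize`, and `_trafilatura_extract` are hypothetical helpers; the only library call assumed is `trafilatura.extract(html, favor_precision=...)`.

```python
from typing import Callable, Dict, Optional

# Any callable that takes HTML plus keyword options and returns text or None.
Extractor = Callable[..., Optional[str]]


def _trafilatura_extract(html: str, **kwargs) -> Optional[str]:
    # Imported lazily so the helpers below can be exercised with a stub
    # extractor on machines where trafilatura is not installed.
    import trafilatura

    return trafilatura.extract(html, **kwargs)


def compare_precision(
    html: str, extract_fn: Extractor = _trafilatura_extract
) -> Dict[bool, str]:
    """Extract the same HTML with favor_precision off and on.

    Returns the extracted text keyed by the flag value; a None result
    (nothing extracted) is normalized to an empty string.
    """
    results: Dict[bool, str] = {}
    for favor_precision in (False, True):
        text = extract_fn(html, favor_precision=favor_precision)
        results[favor_precision] = text or ""
    return results


def summarize(results: Dict[bool, str]) -> Dict[bool, int]:
    """Character count per setting, as a crude first signal of what changed."""
    return {flag: len(text) for flag, text in results.items()}
```

Calling `compare_precision(html)` on the raw HTML of a problem story and diffing the two strings shows what the stricter setting drops. Length alone does not say whether the removed text was boilerplate or real content, so the outputs still need a manual read.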