r1b closed this issue 7 years ago
For now I am going to remove the code that does this. The only thing I will do to the headline is remove leading / trailing whitespace.
I will also add headline selectors for all sources - should help mitigate the issue.
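A rough sketch of what that could look like, not the actual code from this repo. The source names, the selectors, and the helper names (`HEADLINE_SELECTORS`, `selector_for`, `clean_headline`) are all made up for illustration:

```python
from typing import Dict

# Hypothetical per-source CSS selectors; the source names and selectors
# are placeholders, not the project's real configuration.
HEADLINE_SELECTORS: Dict[str, str] = {
    "example-news": "h1.article-title",
    "example-blog": "header h1",
}

# Assumed fallback when a source has no dedicated selector.
DEFAULT_SELECTOR = "title"


def selector_for(source: str) -> str:
    """Return the headline selector for a source, or the fallback."""
    return HEADLINE_SELECTORS.get(source, DEFAULT_SELECTOR)


def clean_headline(raw: str) -> str:
    """Only trim leading / trailing whitespace; leave delimiters alone."""
    return raw.strip()
```

With this, a title like `"  Some story | Example Site \n"` comes back as `"Some story | Example Site"`: the site name stays, but nothing after a delimiter is ever thrown away by mistake.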
The problem:
Many headlines (esp. from title tags) have a prelude / postlude with a delimiter and the name of the site, e.g.:
I have a heuristic that strips these out, but it is unreliable. Further, sometimes you actually want the info after the delimiter, e.g.: