How to improve accuracy?

I've tested it on several websites so far, and it only grabs tiny excerpts of what it thinks is the main content. The extracted text does come from the main content, but the rest of the main content is ignored.

I used the recipe to generate the final output text. How can I tweak it so that it grabs all of the expected main content text?

Does it use pre-trained weights by default? How can I "teach" it so that its accuracy improves?

So far I tested:

https://news.ycombinator = grabs only the first submission

https://openai.com/blog/openai-pytorch/ = grabs only "In the past, we implemented projects in many frameworks depending on their relative strengths. We’ve now chosen to standardize to make it easier for our team to create and share optimized implementations of our models." The first sentence and the rest of the text are missing.
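
To put a number on "missing the rest of the text", one option is to compare the extracted output against a hand-copied version of the expected main content. The snippet below is only a rough sketch and is not part of web2text: `tokenize` and `overlap_scores` are hypothetical helper names, and word-level overlap is an assumed, simplistic way to measure extraction quality.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def overlap_scores(extracted, expected):
    """Return (precision, recall, f1) of word overlap.

    extracted: text produced by the extractor (hypothetical input)
    expected:  main content copied from the page by hand
    """
    ext = Counter(tokenize(extracted))
    exp = Counter(tokenize(expected))
    # Multiset intersection: each word counts as often as it appears in both.
    common = sum((ext & exp).values())
    precision = common / sum(ext.values()) if ext else 0.0
    recall = common / sum(exp.values()) if exp else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy example: the extractor returned only half of the expected words.
    p, r, f = overlap_scores("b c", "a b c d")
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

High precision with low recall, as in the toy example (precision 1.0, recall 0.5), would match what I am seeing: everything that is extracted belongs to the main content, but most of the main content is left out.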

dalab / web2text

how to improve accuracy? #10