codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License
14.09k stars, 2.11k forks

not getting relevant content for this link #547

Open kalhan123 opened 6 years ago

kalhan123 commented 6 years ago

https://economictimes.indiatimes.com/small-biz/startups/newsbuzz/walmart-completes-due-diligence-for-its-big-flipkart-buy/articleshow/63644664.cms--

While using the newspaper library, some articles on economictimes seem to give relevant content, whereas some don't. Can you help me figure out the error?

codelucas commented 6 years ago

Thanks for filing, @kalhan123. Can you please post different links, with the expected full-text extract and the actual full-text extract for each?

kalhan123 commented 6 years ago

Sample text required: Walmart completed a thorough due diligence process on e-commerce firm Flipkart this week, two sources said, as the US retail giant looks to take a controlling stake of 51 percent or more in the Indian company.

Walmart has already floated a shareholder agreement, or offer proposal, and is looking to shell out about $10 billion to $12 billion for the stake that would value Flipkart at roughly $20 billion, one of the sources familiar with the matter said.

What I am actually getting: 'DID YOU KNOW?\n\nYou can save upto Rs. 46,350 with Tax Saving Mutual Funds'

However, the library seems to work for this link: https://economictimes.indiatimes.com/industry/banking/finance/banking/some-icici-directors-opposed-to-chanda-kochhar-continuing-in-her-role-claims-report/articleshow/63679290.cms
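
For anyone reproducing this, the expected-versus-actual comparison requested above could be scripted roughly as follows. This is only a sketch: it assumes the usual newspaper3k `Article` workflow (`download()`, `parse()`, then reading `.text`), and the helper names `normalize`, `extraction_matches`, and `fetch_text` are hypothetical, not part of the library. The two sample strings are copied from the comment above.

```python
import re

# Strings taken from the report above: the opening of the expected article
# text, and the promotional text that was actually extracted.
EXPECTED_SNIPPET = "Walmart completed a thorough due diligence process"
ACTUAL = "DID YOU KNOW?\n\nYou can save upto Rs. 46,350 with Tax Saving Mutual Funds"


def normalize(text: str) -> str:
    """Collapse runs of whitespace, trim the ends, and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def extraction_matches(extracted: str, expected_snippet: str) -> bool:
    """Return True if the expected snippet appears in the extracted text,
    ignoring differences in whitespace and letter case."""
    return normalize(expected_snippet) in normalize(extracted)


def fetch_text(url: str) -> str:
    """Download and parse one article with newspaper3k and return its text.

    Hypothetical helper. The import is local so the comparison helpers above
    can be used without newspaper3k installed. Requires network access.
    """
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return article.text


# Offline check against the strings quoted in this thread.
print(extraction_matches(ACTUAL, EXPECTED_SNIPPET))
```

To check a live page, something like `extraction_matches(fetch_text(url), EXPECTED_SNIPPET)` could be run for each link, posting the URL, the expected snippet, and the result for every case that fails.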