Is there a way to specify the language of a news article so that it is fetched correctly?
I used the library on a news article in Portuguese, but it converted accented characters to their unaccented equivalents.
This severely compromises NLP procedures that deal with syntax, context, etc.
Example: "àáéóíúâôêãõç" is converted to "aaeiuaoeaoc".
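To make the symptom easier to reproduce and check, here is a minimal sketch using only the Python standard library. It illustrates the kind of accent loss described above; it is not a claim about how newsfetch produces it. The helper names `strip_accents` and `has_accents` are hypothetical and exist only for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop diacritics, mimicking the lossy conversion described above."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def has_accents(text: str) -> bool:
    """Return True if the text still contains at least one diacritic."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


print(strip_accents("São Paulo, coleção"))
print(has_accents("coleção"), has_accents("colecao"))
```

A check like `has_accents` could be run on the fetched text of a Portuguese article: if it returns `False` for a long text, the accents were most likely lost during extraction.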
from newsfetch.news import newspaper
news = newspaper('https://g1.globo.com/sc/santa-catarina/noticia/2021/01/20/greve-na-comcap-coleta-feita-por-empresa-privada-em-florianopolis-vai-abranger-35percent-do-roteiro-diz-prefeitura.ghtml')
I saw that the class uses the Newspaper3k scraper internally, and if I enforce the right language, it returns the correct text.
from newspaper import Article
article = Article(url, language='pt')
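For completeness, the workaround can be sketched as a small helper, assuming Newspaper3k is installed (`pip install newspaper3k`). The function name `fetch_article_text` and its `language` default are hypothetical; only `Article(url, language=...)`, `download()`, `parse()`, and `.text` come from Newspaper3k itself.

```python
def fetch_article_text(url: str, language: str = "pt") -> str:
    """Fetch an article with Newspaper3k, forcing the given language code."""
    # Imported lazily so this helper can be defined without the dependency.
    from newspaper import Article  # third-party package: newspaper3k

    article = Article(url, language=language)
    article.download()  # performs the HTTP request
    article.parse()     # extracts title, text, authors, etc.
    return article.text


# Usage (requires network access):
# text = fetch_article_text("https://g1.globo.com/...", language="pt")
```

Something along these lines, exposed as an optional `language` argument on `newspaper(...)`, would be enough for my use case.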
Thank you.