-
### GPT-3 data mix
* Datasets are not sampled in proportion to their size
* Datasets we view as higher-quality are sampled more frequently
* WebText2, Books1, Wikipedia datasets are sampl…
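
A minimal sketch of what sampling by quality weight rather than by size can look like. The dataset names, token counts, and weights below are hypothetical placeholders, not figures from these notes; the point is only that a small, highly weighted corpus ends up being repeated far more often than a large, lightly weighted one.

```python
import random
from collections import Counter

# Hypothetical corpora: sizes (in tokens) and hand-picked sampling weights.
# None of these numbers come from the notes above; they are illustrative only.
DATASETS = {
    "large_web_crawl": {"tokens": 400_000, "weight": 0.60},
    "curated_web": {"tokens": 20_000, "weight": 0.25},
    "encyclopedia": {"tokens": 3_000, "weight": 0.15},
}


def sample_sources(n, seed=0):
    """Draw n training chunks, choosing each chunk's source by weight, not size."""
    rng = random.Random(seed)
    names = list(DATASETS)
    weights = [DATASETS[name]["weight"] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n))


def epochs_seen(counts, tokens_per_draw):
    """Estimate how many passes over each corpus the drawn chunks amount to."""
    return {
        name: counts[name] * tokens_per_draw / DATASETS[name]["tokens"]
        for name in DATASETS
    }


if __name__ == "__main__":
    counts = sample_sources(10_000)
    for name, epochs in epochs_seen(counts, tokens_per_draw=100).items():
        print(f"{name}: {counts[name]} draws, about {epochs:.1f} epochs")
```

With weights like these, the smallest corpus is seen many times over while the largest is seen only about once, which is the effect the bullets above describe.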
-
I'm not sure if it's an issue with the HTML of the website, if there's an issue parsing Tajiki, or something else, but I tried scraping http://www.jumhuriyat.tj/index.php?art_id=44635 on the Heroku de…
-
First off, thanks for the library @codelucas, it's awesome!
I am experiencing an exception on deployment to AWS Lambda, and deployment to Elastic Beanstalk results in a build failure. The cause is …
-
It would be nice to be able to print the article text to stdout. It could look like this:
```
# prints article text, we can redirect to file with sh
newspaper https://www.cnn.com/2020/09/08/health/coronavi…
-
**Issue by [dividor](https://github.com/dividor)**
_Fri Apr 26 19:34:51 2019_
_Originally opened as https://github.com/codelucas/newspaper/issues/698_
----
Love this package, amazingly useful.
I…
-
I got this running in an Amazon Lambda function, and I wanted to share how I did it just in case it was useful for others.
[This gist](https://gist.github.com/JamesChevalier/d212c1998360520dd8a0e67cf…
-
**Issue by [ariel-frischer](https://github.com/ariel-frischer)**
_Sun Feb 2 02:04:58 2020_
_Originally opened as https://github.com/codelucas/newspaper/issues/776_
----
First off I would like to t…
-
Description
--------------
If a word in an article's main content is a link, the newspaper library skips that word, and it does not appear in `article.text`.
For example, in th…
-
When I try to install it on Python 3, many errors come up. Is there any solution?
-
**Issue by [ekingery](https://github.com/ekingery)**
_Thu Jun 15 03:41:01 2017_
_Originally opened as https://github.com/codelucas/newspaper/issues/384_
----
First off, thanks for the library @code…