KillerCodeMonkey closed this issue 7 years ago
You can disable downloading of images via fetch_images. E.g.:
from newspaper import Article
article = Article(url='http://cnn.com', fetch_images=False)
More here. I'm able to use newspaper that way in a Docker container without installing any image libs, so hopefully that works for you too.
I already set fetch_images to False, but I still get this error:
Unable to import module 'newspaper3k': cannot import name '_imaging'
Okay, I got it. Now newspaper is found, but it seems to create a data directory, .newspaper_scraper, which does not exist on Lambda:
START RequestId: 4fc82e64-b3de-11e7-baae-7b689a397efc Version: $LATEST
module initialization error: [Errno 2] No such file or directory: '/home/sbx_user1062/.newspaper_scraper'
@codelucas would it be possible to add an environment variable to set the base .newspaper_scraper path?
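To make the request concrete, here is a minimal sketch of how such an override could resolve the directory. It is not newspaper's actual code: the variable name NEWSPAPER_BASE_DIR and the helper resolve_base_dir are hypothetical, and falling back to the system temp directory (typically /tmp, which is writable on Lambda) is an assumption about what a sensible default would be.

```python
import os
import tempfile

# Hypothetical variable name for illustration; newspaper does not define it.
ENV_VAR = "NEWSPAPER_BASE_DIR"
# Directory name taken from the error message in this thread.
DATA_DIRECTORY = ".newspaper_scraper"


def resolve_base_dir(environ=None):
    """Return the data directory path, creating it if it is missing."""
    environ = os.environ if environ is None else environ
    # Prefer the override; otherwise fall back to the system temp directory.
    parent = environ.get(ENV_VAR) or tempfile.gettempdir()
    path = os.path.join(parent, DATA_DIRECTORY)
    os.makedirs(path, exist_ok=True)
    return path
```

With something like this, a Lambda deployment could set the variable to /tmp in the function configuration and leave other environments untouched.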
created #462
Heyho,
thanks for the great URL article scraping tool. It would be nice to make the image downloading functionality optional, though, so we can use your piece of sweet (code) cake without installing all the image libs (PIL, Pillow, libpng, ...).
Background:
It would be nice if I could use your lib on Amazon's AWS Lambda without building a custom package containing all the needed dependencies, because you cannot install system packages like libpng there.
Thanks!
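For illustration, one common way to make an image dependency optional is to defer its import until image fetching is actually requested. The sketch below is an assumption about how that could look, not how newspaper is implemented; HAS_PIL and load_image_backend are hypothetical names.

```python
import importlib.util

# True only when a PIL package (e.g. Pillow) can be found on the import path.
HAS_PIL = importlib.util.find_spec("PIL") is not None


def load_image_backend(fetch_images):
    """Import PIL only when image fetching is actually requested."""
    if not fetch_images:
        # Text-only use never touches the image libraries.
        return None
    if not HAS_PIL:
        raise RuntimeError("fetch_images=True requires Pillow to be installed")
    from PIL import Image  # deferred import, runs only on this path
    return Image
```

Because the import happens inside the function, a text-only caller passing fetch_images=False would never trigger it, even on a system without libpng.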