Closed: alanakbik closed this 3 months ago
@alanakbik Thanks for reporting this.
Sadly I couldn't reproduce the exact error. I get the same message but Fundus seems to be installed anyway.
Just to be clear: could you run this code snippet after a fresh pip install fundus?
from fundus import Crawler, PublisherCollection
async def crawl():
    crawler = Crawler(*PublisherCollection.de)
    async for article in crawler.crawl_async(max_articles=10, only_complete=False):
        print(article)

await crawl()
While investigating this I found several other problems regarding notebooks.
only_complete is bugged and won't crawl
Update: You don't have to wrap this within an async function. Colab seems to be fine with you doing this:
from fundus import Crawler, PublisherCollection
crawler = Crawler(*PublisherCollection.de)
async for article in crawler.crawl_async(max_articles=10):
    print(article)
Update: I posted the wrong script and updated it.
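For anyone running this outside a notebook: a top-level `async for` only works there because Colab/Jupyter already has an event loop running. In a plain Python script the loop has to be started explicitly. Below is a minimal sketch of that pattern, not Fundus code: `fake_crawl_async` is a hypothetical stand-in for `crawler.crawl_async` so the example runs without network access, and `collect` / `run_collect` are made-up helper names.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_crawl_async(max_articles: int) -> AsyncIterator[str]:
    """Hypothetical stand-in for crawler.crawl_async; yields dummy titles."""
    for i in range(max_articles):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield f"article {i}"


async def collect(max_articles: int) -> List[str]:
    # In a plain script, `async for` has to live inside a coroutine.
    return [article async for article in fake_crawl_async(max_articles)]


def run_collect(max_articles: int) -> List[str]:
    # asyncio.run starts a fresh event loop. In a notebook a loop is
    # already running, so there you would `await collect(...)` directly.
    return asyncio.run(collect(max_articles))


if __name__ == "__main__":
    for title in run_collect(3):
        print(title)
```

Swapping the stand-in for the real crawler should only require replacing the `fake_crawl_async(...)` call with the `crawler.crawl_async(...)` call from the snippet above.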
Yes, can confirm that your snippet runs for me!
@alanakbik We changed the version restrictions for our dependencies, so installing Fundus within Google Colab should no longer yield an error. The new release may take a while, but the changes are already live on master, so using
pip install git+https://github.com/flairNLP/fundus
should do the trick for now.
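To double-check which build actually ended up in the runtime after installing, something like the following may help. It is a sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is made up for illustration, and the distribution name is assumed to be `fundus`, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if the install did not go through.
    print("fundus:", installed_version("fundus"))
```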
Thanks, it now installs without problem!
But both these snippets still don't work for me:
from fundus import CCNewsCrawler, PublisherCollection
crawler = CCNewsCrawler(*PublisherCollection.de)
for article in crawler.crawl(max_articles=10, only_complete=False):
    print(article)
and
from fundus import PublisherCollection, Crawler
# initialize the crawler for news publishers based in the US
crawler = Crawler(PublisherCollection.us)
# crawl 2 articles and print
for article in crawler.crawl(max_articles=2):
    print(article)
but I think this is handled in different PRs?
Regarding the snippets:
crawl_async. I updated the code snippet above. Unfortunately, I originally posted the wrong one (the one referencing the CCNewsCrawler).
Describe the bug
I tried running the tutorial code in a fresh Colab environment, but when running
the installation fails with the output
How to reproduce
Expected behavior.
It installs correctly and I can run all tutorials on Google Colab.
Logs and Stack traces
No response
Screenshots
No response
Additional Context
No response
Environment