Closed timsehn closed 3 years ago
I was only able to get content for ~3,400 articles using the provided scraper:
timsehn$ dolt sql -q "select count(*) from articles where article_content != 'Exception'"
+----------+
| COUNT(*) |
+----------+
| 3394     |
+----------+
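For anyone who wants to see what that filter is counting, here is a minimal sketch using Python's built-in sqlite3 instead of Dolt. The `articles` table and `article_content` column come from the query above; the `url` column, the sample rows, and the `count_scraped` helper are made up for illustration, and I'm assuming the scraper writes the literal string 'Exception' when a fetch fails.

```python
import sqlite3


def count_scraped(rows):
    """Count rows whose article_content is not the 'Exception' marker.

    rows: iterable of (url, article_content) tuples (hypothetical schema).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (url TEXT, article_content TEXT)")
    conn.executemany("INSERT INTO articles VALUES (?, ?)", rows)
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM articles WHERE article_content != 'Exception'"
    ).fetchone()
    conn.close()
    return count


# Made-up sample data: two successful scrapes and one failure.
sample = [
    ("https://example.com/a", "Some article text"),
    ("https://example.com/b", "Exception"),
    ("https://example.com/c", "More article text"),
]
print(count_scraped(sample))
```

Rows with a NULL article_content are also left out of the count, because a comparison against NULL is never true in SQL.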
Hi Curation,
This is Tim, the CEO of the company that built Dolt and DoltHub. Dolt wraps Git semantics around a SQL database, and DoltHub is a place to share those databases. We think this dataset is a good fit for DoltHub.
I took the liberty of importing it, including the scraped articles:
https://www.dolthub.com/repositories/Liquidata/curation-corpus
We thought Dolt might be an interesting tool for you to check out.
--Tim