Closed FYYFU closed 4 years ago
I am not sure what the issue is, and I am not sure the person who wrote this code is available to debug. I recommend requesting the dataset through our form: https://cornell.qualtrics.com/jfe/form/SV_6YA3HQ2p75XH4IR
Thank you very much. I have downloaded the dataset and got some information from the Newsroom website.
Hi, I tried to build the Newsroom dataset from scratch. After I ran
newsroom-scrape --thin thin/dev.jsonl.gz --archive dev.archive
the dev.archive file does not seem to appear. The command took 10 days to run, and nothing was left afterwards. So I wonder whether something is wrong with scraping from scratch? The final result I got is: