Closed jacobdanovitch closed 4 years ago
Absolutely! I’ll merge ASAP!
Thank you for checking!!
No problem. I haven't used this before so I'm not sure what it previously downloaded; was it only looking for the .tar.gz files? If so, I'll make that the default file filter.
It was rather poorly coded up till now: it simply grabbed the last 10 files from the data, because I didn’t expect that to change :) The bulk of the data is stored in gz files, so I think that’s a reasonable filter.
PR opened! I've been testing it in Colab, so it could probably use a quick test just in case, but it should work for the listed cases. The only additional library used is re.
Merged! I’ll fiddle around with it locally for a bit before it goes on PyPI.
Yeah looks good to me, I am going to go ahead and package! Thank you!!!
Tried to run download("data") this morning and only got 10 JSON files from last night's bioRxiv update. It seems that about 950 JSON files were uploaded without a folder, which breaks the current download function. I rewrote it to handle this and to let people specify which files they'd like to download (either matching a regex or containing a substring). Should I open a PR?
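For anyone following along, here is a rough sketch of the kind of file filter described above. It is not the code from the PR: the function name `filter_files`, its parameters, and the example file names are all made up for illustration, and the `.tar.gz` default is only an assumption based on the discussion in this thread.

```python
import re

# Assumed default: keep only .tar.gz archives when no filter is given.
DEFAULT_PATTERN = r"\.tar\.gz$"


def filter_files(filenames, pattern=None, substring=None):
    """Hypothetical helper: select file names by regex and/or substring.

    If neither `pattern` nor `substring` is given, fall back to the
    assumed default of keeping only .tar.gz files.
    """
    if pattern is None and substring is None:
        pattern = DEFAULT_PATTERN

    selected = []
    for name in filenames:
        # Skip names that do not match the regex, when one is set.
        if pattern is not None and not re.search(pattern, name):
            continue
        # Skip names that do not contain the substring, when one is set.
        if substring is not None and substring not in name:
            continue
        selected.append(name)
    return selected


# Example listing with made-up names, mixing archives and loose JSON files.
listing = ["biorxiv_medrxiv.tar.gz", "abc123.json", "metadata.csv", "comm_use_subset.tar.gz"]
archives = filter_files(listing)
json_only = filter_files(listing, pattern=r"\.json$")
```

Filtering on names rather than taking the last N entries means loose files uploaded outside a folder no longer change what gets downloaded.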