lamalab-org / matextract-book

http://matextract.pub/
MIT License

make reaction datasets less annoying to get #55

Closed kjappelbaum closed 3 months ago

kjappelbaum commented 3 months ago

without git clone etc.

raised by @fekad

kjappelbaum commented 3 months ago

The simplest option would be to put them on Hugging Face.

kjappelbaum commented 3 months ago

@MrtinoRG do you think we should put them on Hugging Face?

kjappelbaum commented 3 months ago

I assigned you @MrtinoRG as you added the notebook, but if you have too many other things on your plate, I can also look into it a bit later this week

MrtinoRG commented 3 months ago

Which datasets would you like to include? I imagine it is the USPTO-ORD-100K dataset, but I am not sure.

kjappelbaum commented 3 months ago

Oh, I would use the same ones you have now, but just put the files on Hugging Face instead of obtaining them from git.

Alternatively, I'd add a small util, or use pystow, to just download them wget-style.
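
For illustration, the "small util" option could look roughly like the sketch below. It uses only the Python standard library; the function name `ensure_file` and the cache layout are hypothetical and not taken from the matextract repository, and the actual dataset URLs are not given in this thread. pystow offers a similar download-once-and-cache helper, so this is only one possible shape for the idea.

```python
import shutil
import urllib.request
from pathlib import Path
from typing import Optional


def ensure_file(url: str, cache_dir: Path, name: Optional[str] = None) -> Path:
    """Download `url` into `cache_dir` once and return the local path.

    Hypothetical helper: later calls reuse the cached copy instead of
    downloading again, so a notebook can call it unconditionally.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Default to the last path component of the URL as the file name.
    target = cache_dir / (name or url.rstrip("/").rsplit("/", 1)[-1])
    if not target.exists():
        tmp = target.with_name(target.name + ".part")
        # Write to a temporary file first so an interrupted download
        # is never mistaken for a complete one.
        with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(target)
    return target
```

A notebook cell would then call `ensure_file(dataset_url, Path.home() / ".cache" / "matextract")` with whatever URL the files end up hosted at, with no `git clone` needed.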