kjappelbaum closed 3 months ago
The simplest option would be to put them on HuggingFace.
@MrtinoRG do you think we should put it on HuggingFace?
I assigned you @MrtinoRG as you added the notebook, but if you have too many other things on your plate, I can also look into it a bit later this week
Which datasets would you like to include? I imagine it is the USPTO-ORD-100K dataset, but I am not sure.
Oh, I would use the same ones you have now, but just put the files on HuggingFace instead of obtaining them from git.
Alternatively, I'd have some util or use pystow to just wget them without git clone etc.
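To make the "some util" option concrete, here is a rough stdlib-only sketch of a download-once helper. It is not the project's code: `ensure_file`, `hf_dataset_url`, and `CACHE_DIR` are hypothetical names, and the repo id and filename in the usage comment are placeholders, since the thread does not say where the files would live. The URL pattern is the usual HuggingFace "resolve" link for a file in a dataset repo.

```python
import urllib.request
from pathlib import Path

# Hypothetical cache location; the thread does not specify one.
CACHE_DIR = Path.home() / ".cache" / "example_datasets"


def hf_dataset_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a HuggingFace dataset repo."""
    return f"https://huggingface.co/datasets/{repo_id}/resolve/{revision}/{filename}"


def ensure_file(url: str, name: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Download `url` to `cache_dir/name` unless it is already cached."""
    target = Path(cache_dir) / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted transfer
        # does not leave a truncated file that looks like a cache hit.
        tmp = target.with_name(target.name + ".part")
        urllib.request.urlretrieve(url, tmp)
        tmp.replace(target)
    return target


# Usage (placeholder repo id and filename, adjust to the real ones):
# url = hf_dataset_url("your-org/your-dataset", "train.jsonl")
# path = ensure_file(url, "train.jsonl")
```

The notebook would then read from the returned path, so no `git clone` is needed and repeated runs reuse the cached copy. pystow or the `huggingface_hub` client could replace this helper if an extra dependency is acceptable.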
raised by @fekad