Planteome / planteome-annotation-data

This is a place to discuss issues around the Planteome annotation data and store useful scripts etc.

Download most recent version of all .assoc files in bulk #29

Closed: serenalotreck closed this issue 2 years ago

serenalotreck commented 2 years ago

I'd like to bulk-download the most recent versions of all the .assoc files from the data repository. Is there a good way to do this without copy-pasting every file name into a list for a bash/Python script that downloads each file in turn?

Thanks!

elserj commented 2 years ago

It is an SVN repo set up for anonymous checkout (but not commits), so you can run:
svn co http://palea.cgrb.oregonstate.edu/svn/associations

Alternatively, it should be possible with wget (see https://unix.stackexchange.com/questions/117988/wget-with-wildcards-in-http-downloads). I've only tested this briefly, but it appears to work:
wget -r -nH --cut-dirs=2 -l2 -np "http://palea.cgrb.oregonstate.edu/svn/associations" -A "*.assoc"
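For anyone reusing the wget approach, here is a minimal sketch that wraps the same command in a function and annotates each flag. The URL and flags come from the command above; the names `fetch_assoc`, `DEST`, and `RUN`, and the extra `-P` option, are assumptions added for illustration and are not part of the original command.

```shell
#!/bin/sh
# Sketch only: wraps the wget command from this thread in a function.
# BASE_URL is the repository URL given above.
# DEST, RUN, and fetch_assoc are hypothetical names introduced here.

BASE_URL="http://palea.cgrb.oregonstate.edu/svn/associations"
DEST="${DEST:-assoc_files}"   # output directory (assumption, not in the original)

fetch_assoc() {
  # -r            recurse into subdirectories
  # -l2           limit recursion depth to two levels
  # -np           never ascend to the parent directory
  # -nH           do not create a directory named after the host
  # --cut-dirs=2  drop the leading "svn/associations" path components
  # -A "*.assoc"  keep only files whose names end in .assoc
  # -P "$DEST"    save everything under $DEST (added for this sketch)
  # RUN is empty by default; set RUN=echo to print the command instead.
  ${RUN:-} wget \
    -r -l2 -np \
    -nH --cut-dirs=2 \
    -A "*.assoc" \
    -P "$DEST" \
    "$BASE_URL"
}
```

Calling `RUN=echo fetch_assoc` prints the command without touching the network, which is a cheap way to check the flags; calling `fetch_assoc` on its own performs the download.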

serenalotreck commented 2 years ago

The wget command worked, thanks so much!