ltetrel closed this issue 2 years ago
@tsalo As you can see here: https://github.com/SIMEXP/Repo2Data/blob/4d09e23b3966b505da29296e04060148c3516f7f/repo2data/repo2data.py#L127-L141, I check whether a `data_requirement` file already exists; if its content matches the target user requirement file, the download is bypassed, so no internet access is required once the data have already been downloaded.
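The cache check described above can be sketched roughly as follows. This is a hypothetical simplification, not repo2data's actual code; `needs_download` and both path parameters are illustrative names:

```python
import json
import os

def needs_download(target_req_path, cached_req_path):
    """Return True if the data must be (re)downloaded.

    Sketch of the described behavior: if a data_requirement file from a
    previous run exists and its content matches the target requirement
    file, the download is skipped entirely.
    """
    if not os.path.exists(cached_req_path):
        return True  # never downloaded before
    with open(target_req_path) as f:
        target = json.load(f)
    with open(cached_req_path) as f:
        cached = json.load(f)
    # identical requirements -> data already present, no internet needed
    return target != cached
```

If the target requirement changes (for example, a new source URL), the comparison fails and a fresh download is triggered.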
Did you change the `data_requirement.json` file? Can you send me the layout of your downloaded directory, plus the content of the downloaded `data_requirement.json` and the target `data_requirement.json`?
I am also trying on my end with your requirement file. There is an error when downloading with `osfclient`; it might be a timeout issue. Were you able to download with `repo2data` entirely without error? Unfortunately, if there is an issue with the `osf` fetcher or with your data, there is nothing more I can do...
```
  File "/srv/conda/envs/notebook/bin/osf", line 8, in <module>
    sys.exit(main())
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/osfclient/__main__.py", line 104, in main
    exit_code = args.func(args)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/osfclient/cli.py", line 91, in wrapper
    return_value = f(cli_args)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/osfclient/cli.py", line 167, in clone
    file_.write_to(f)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/osfclient/models/file.py", line 57, in write_to
    int(response.headers['Content-Length']))
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/requests/structures.py", line 54, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/bin/repo2data", line 60, in <module>
    main()
  File "/srv/conda/envs/notebook/bin/repo2data", line 57, in main
    repo2data.install()
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/repo2data/repo2data.py", line 75, in install
    ret += [Repo2DataChild(self._data_requirement_file, self._use_server).install()]
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/repo2data/repo2data.py", line 249, in install
    self._scan_dl_type()
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/repo2data/repo2data.py", line 238, in _scan_dl_type
    self._osf_download()
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/repo2data/repo2data.py", line 211, in _osf_download
    , self._dst_path])
  File "/srv/conda/envs/notebook/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['osf', '--project', 't8h9c', 'clone', './data/nimare-paper']' returned non-zero exit status 1.
```
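The `KeyError: 'content-length'` above means the OSF server's response did not include a `Content-Length` header (which can happen, e.g., with chunked transfer encoding), and osfclient indexes `response.headers['Content-Length']` directly. A defensive pattern is to use `.get()` with a fallback. The sketch below uses a hypothetical helper name (`stream_to`) and a duck-typed response object; it is not osfclient's actual code:

```python
def stream_to(response, fp, chunk_size=8192):
    """Stream a download to a file object, tolerating a missing
    Content-Length header.

    `response` is any object with a `.headers` mapping and an
    `.iter_content(chunk_size=...)` method (e.g. a requests.Response).
    Returns (bytes_written, expected_total_or_None).
    """
    total = response.headers.get("Content-Length")  # None if absent
    total = int(total) if total is not None else None
    written = 0
    for chunk in response.iter_content(chunk_size=chunk_size):
        fp.write(chunk)
        written += len(chunk)
    return written, total
```

With `.get()`, a missing header simply disables the progress total instead of aborting the whole clone.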
I got the same error when trying to download. I previously only started the download and cancelled it when I thought it would run through successfully, but I was clearly wrong. It failed after ~30 minutes, I think, so it didn't go for the full hour.
I'm wondering if it would make more sense to either (1) switch to just using the `googledrive` option instead of OSF, or (2) zip everything into a single file? I could check for the data folder and zipped file at the beginning of each book script. It won't be pretty, but it might at least work...
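The guard proposed above, at the top of each book script, could look something like this. The paths and the helper name `ensure_data` are hypothetical (only `./data/nimare-paper` appears in the thread); this is a sketch, not the book's actual code:

```python
import os
import zipfile

def ensure_data(data_dir="./data/nimare-paper",
                zip_path="./data/nimare-paper.zip"):
    """Return True if the data directory is ready to use.

    If the directory is missing but the downloaded zip is present,
    extract the zip next to it; if neither exists, report failure so
    the script can trigger (or request) a download.
    """
    if os.path.isdir(data_dir):
        return True  # data already extracted
    if os.path.isfile(zip_path):
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(os.path.dirname(data_dir) or ".")
        return os.path.isdir(data_dir)
    return False  # neither data nor archive found
```

Each script would then call `ensure_data()` once before any analysis touches the data folder.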
This is what I would actually suggest, yes: try with `gdrive`. A zip can indeed help with the download here; however, `repo2data` will unzip only if the top-level item itself is archived (i.e., in your case, the `googledrive` folder): https://github.com/SIMEXP/Repo2Data/blob/4d09e23b3966b505da29296e04060148c3516f7f/repo2data/repo2data.py#L105-L107
I first wanted to scan the whole directory content and unzip each file one by one, but I was afraid it would unzip too much (like `.nii.gz` files, for example) and take too much time.
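That top-level-only policy can be sketched as below. This is an illustrative reconstruction of the behavior described above, not repo2data's actual implementation; the function name, the extension tuple, and the zip-only branch are assumptions:

```python
import os
import zipfile

ARCHIVE_EXTS = (".zip", ".tar", ".tar.gz", ".tgz")

def maybe_extract_top_level(downloaded_path, dest_dir):
    """Extract only when the downloaded top-level item is itself an
    archive. Files nested inside the data directory (e.g. compressed
    .nii.gz images) are deliberately left untouched."""
    if os.path.isdir(downloaded_path):
        return False  # a directory: nothing to extract
    if not downloaded_path.endswith(ARCHIVE_EXTS):
        return False  # not a recognized archive
    if downloaded_path.endswith(".zip"):
        with zipfile.ZipFile(downloaded_path) as zf:
            zf.extractall(dest_dir)
        return True
    # tarball handling omitted from this sketch
    return False
```

This explains why the `.nii.gz` files inside the zipped data folder stay compressed after extraction: only the outermost archive is ever opened.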
Also, I saw a notice on OSF saying that they are experiencing a lot of spam. You may want to check with the admins, just in case.
Thanks! I've zipped the data files into a single file, uploaded it to Google Drive, and replaced the OSF repo URL in the data requirement file with the Google Drive file URL. I just submitted to RoboNeuro again. 🤞
EDIT: I started a build locally (I had to stop it because my laptop can't run all of the analyses) and the data files looked good. The compressed files within the data folder (e.g., `.nii.gz` files) were still compressed.
I will close this issue, since `repo2data` behaved correctly (it tries to re-download because the previous attempt failed). @tsalo