Open adswa opened 2 years ago
I feel like this might be a problem specifically with the newly added 7T data. I'm seeing the same thing for some other randomly selected subjects with 7T data, 100610 and 148133:
<sub-id>/MNINonLinear/Results/tfMRI_MOVIE1_7T_AP/*.txt
This is unrelated to the 7T data, but we just found 7 subjects with a similar error pattern:
{"url":"HCP_1200/212419/MNINonLinear/Results/tfMRI_RELATIONAL_LR/tfMRI_RELATIONAL_LR.nii.gz","size":194996822,"lastmodified":"2018-08-21T01:13:13.000Z","target":"MNINonLinear//Results/tfMRI_RELATIONAL_LR/tfMRI_RELATIONAL_LR.nii.gz"}
{"url":"HCP_1200/401422/MNINonLinear/Results/tfMRI_SOCIAL_RL/tfMRI_SOCIAL_RL_Atlas.dtseries.nii","size":100679392,"lastmodified":"2018-08-22T09:35:39.000Z","target":"MNINonLinear//Results/tfMRI_SOCIAL_RL/tfMRI_SOCIAL_RL_Atlas.dtseries.nii"}
{"url":"HCP_1200/638049/MNINonLinear/Results/rfMRI_REST1_RL/rfMRI_REST1_RL.R.native.func.gii","size":731096961,"lastmodified":"2018-08-23T13:44:52.000Z","target":"MNINonLinear//Results/rfMRI_REST1_RL/rfMRI_REST1_RL.R.native.func.gii"}
{"url":"HCP_1200/884064/unprocessed/3T/tfMRI_MOTOR_RL/884064_3T_tfMRI_MOTOR_RL.nii.gz","size":298915217,"lastmodified":"2018-08-25T01:07:22.000Z","target":"unprocessed//3T/tfMRI_MOTOR_RL/884064_3T_tfMRI_MOTOR_RL.nii.gz"}
{"url":"HCP_1200/886674/MNINonLinear/Results/tfMRI_LANGUAGE_LR/tfMRI_LANGUAGE_LR.R.native.func.gii","size":226305504,"lastmodified":"2018-08-25T01:24:59.000Z","target":"MNINonLinear//Results/tfMRI_LANGUAGE_LR/tfMRI_LANGUAGE_LR.R.native.func.gii"}
{"url":"HCP_1200/894067/MNINonLinear/Results/tfMRI_EMOTION_RL/tfMRI_EMOTION_RL.R.native.func.gii","size":121684503,"lastmodified":"2018-08-25T02:29:23.000Z","target":"MNINonLinear//Results/tfMRI_EMOTION_RL/tfMRI_EMOTION_RL.R.native.func.gii"}
{"url":"HCP_1200/901139/MNINonLinear/Results/rfMRI_REST2_RL/rfMRI_REST2_RL.L.native.func.gii","size":735529336,"lastmodified":"2018-08-25T04:13:08.000Z","target":"MNINonLinear//Results/rfMRI_REST2_RL/rfMRI_REST2_RL.L.native.func.gii"}
The problem was a silent, undetected failure of addurls (more of the story is in #33).
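For anyone triaging records like the seven listed above, here is a minimal sketch that groups them by subject and subdataset. It assumes the records are JSON lines of the kind fed to addurls, and that the `//` in `target` marks a subdataset boundary (the addurls filename-format convention). The function name `group_by_subdataset` and the two-record `SAMPLE` (copied from the list above) are for illustration only.

```python
import json
from collections import defaultdict

# Two of the records quoted above, one JSON object per line.
SAMPLE = '''
{"url":"HCP_1200/212419/MNINonLinear/Results/tfMRI_RELATIONAL_LR/tfMRI_RELATIONAL_LR.nii.gz","size":194996822,"lastmodified":"2018-08-21T01:13:13.000Z","target":"MNINonLinear//Results/tfMRI_RELATIONAL_LR/tfMRI_RELATIONAL_LR.nii.gz"}
{"url":"HCP_1200/884064/unprocessed/3T/tfMRI_MOTOR_RL/884064_3T_tfMRI_MOTOR_RL.nii.gz","size":298915217,"lastmodified":"2018-08-25T01:07:22.000Z","target":"unprocessed//3T/tfMRI_MOTOR_RL/884064_3T_tfMRI_MOTOR_RL.nii.gz"}
'''


def group_by_subdataset(jsonl_text):
    """Map (subject, subdataset) -> list of (path within subdataset, size)."""
    groups = defaultdict(list)
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: '//' separates the subdataset from the path inside it.
        subds, sep, path = record["target"].partition("//")
        if not sep:
            # No boundary marker: treat the file as part of the top level.
            subds, path = "", record["target"]
        # Assumption: URLs look like HCP_1200/<subject>/..., as in the records above.
        subject = record["url"].split("/")[1]
        groups[(subject, subds)].append((path, record["size"]))
    return dict(groups)


groups = group_by_subdataset(SAMPLE)
for (subject, subds), files in sorted(groups.items()):
    for path, size in files:
        print(subject, subds, path, size)
```

Grouping this way makes it easier to see which subdataset of which subject would need its URLs re-added.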
Sorry to reply to an old issue - do you know if all the data in these subdatasets have available copies somewhere?
Yes. For example at https://hub.datalad.org/hcp-openaccess
I meant the data within these datasets. I'm seeing a lot of
Results/tfMRI_RETCW_7T_PA/Movement_AbsoluteRMS.txt (file) [not available; (Note that these git remotes have annex-ignore set: origin)]
messages when trying to datalad get some of these.
I have been updating the subsampled datasets that derive from this large dataset and can also be found under this organization. This brought to light that there are a number of files in the dataset that can't be retrieved. The number is quite small compared to the overall number of files, but worth investigating. We should make sure that these files were indeed removed from the bucket and remove them from the datasets too, or, if they actually are available, figure out what went wrong with adding their URLs.
There is a single one in https://github.com/datalad-datasets/hcp_smoothedmyelin/pull/2 (this is a newly added file):
In https://github.com/datalad-datasets/hcp-functional-connectivity there seem to be some systematic failures:
I see this pattern of files without a known copy for a few, but not all, subjects in the dataset. A few example subjects where the failure occurs are 118225 and 782561. The subjects where I don't see it do not seem to contain these files in the first place. One example is subject 987074. Does any of this ring a bell? Ping for awareness @mih @loj
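To enumerate files without a known copy more systematically, something like the following sketch might help. It assumes JSON-lines output in the shape produced by `git annex whereis --json`, where each record has a `file` field and a `whereis` list of known locations. The helper name `files_without_copies`, the key strings, and the second sample file are made up for illustration; only the `Movement_AbsoluteRMS.txt` path comes from this thread.

```python
import json


def files_without_copies(whereis_jsonl):
    """Return the files whose 'whereis' list is empty or missing."""
    missing = []
    for line in whereis_jsonl.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: an empty 'whereis' list means no known copy anywhere.
        if not record.get("whereis"):
            missing.append(record["file"])
    return missing


# Illustrative records only; keys and the second file name are hypothetical.
sample_records = [
    {
        "command": "whereis",
        "file": "Results/tfMRI_RETCW_7T_PA/Movement_AbsoluteRMS.txt",
        "key": "HYPOTHETICAL-KEY-1",
        "success": False,
        "untrusted": [],
        "whereis": [],
    },
    {
        "command": "whereis",
        "file": "Results/example_available_file.txt",
        "key": "HYPOTHETICAL-KEY-2",
        "success": True,
        "untrusted": [],
        "whereis": [{"description": "web", "here": False, "urls": []}],
    },
]
sample_output = "\n".join(json.dumps(r) for r in sample_records)

missing = files_without_copies(sample_output)
print(missing)
```

In a real clone, the input would presumably be the captured output of `git annex whereis --json` run inside the affected subdataset, and the resulting list could then be checked against the bucket.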