datalad / datalad-registry


Failures in dataset population #93

Open candleindark opened 1 year ago

candleindark commented 1 year ago

A Datalad registry instance intermittently fails to populate the datasets listed on https://github.com/datalad/datalad-usage-dashboard.

Populating a Datalad registry instance with the first 10, and then the first 20, active GitHub datasets listed on https://github.com/datalad/datalad-usage-dashboard failed intermittently. The failures differ from one another: each failed population attempt broke while gathering information about a different dataset.
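To tell dataset-specific failures apart from random ones, the attempts could be repeated and the failures tallied per URL. The following is only a minimal sketch under stated assumptions: `submit` is a hypothetical stand-in for whatever performs one submission and reports success, and `tally_failures` is an illustrative name, not part of datalad-registry.

```python
from collections import Counter
from typing import Callable, Iterable


def tally_failures(
    submit: Callable[[str], bool],
    urls: Iterable[str],
    attempts: int = 5,
) -> Counter:
    """Repeat the population attempt and count failures per URL.

    `submit` is a hypothetical callable (an assumption, not a registry API):
    it takes one dataset URL and returns True if the dataset info was
    gathered and False otherwise. An exception is counted as a failure.
    """
    urls = list(urls)  # allow iterating the same URLs on every attempt
    failures: Counter = Counter()
    for _ in range(attempts):
        for url in urls:
            try:
                ok = submit(url)
            except Exception:
                ok = False
            if not ok:
                failures[url] += 1
    return failures
```

A URL that fails in every attempt would point at a problem with that dataset, whereas failures spread across different URLs would match the intermittent pattern described above.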

Resolving https://github.com/datalad/datalad-registry/issues/86 would help in locating the cause of this issue.

candleindark commented 1 year ago

One observation is that a failed submission of URLs to a Datalad registry instance using registry_submit_urls doesn't seem to affect the result of another registry_submit_urls submission to the same instance.

candleindark commented 1 year ago

Another observation is that population of dataset info can fail on a subsequent submission of URLs to a Datalad registry instance using registry_submit_urls, not only on the first submission to that instance.

yarikoptic commented 1 year ago

#134 and #135 might be of this kind but more specific. #133 might provide instrumentation to ease debugging.

Do you want to keep this issue open, @candleindark, or should we consider the more specific issues a replacement for it?

candleindark commented 1 year ago

I think we should keep this open for now. #134 and #135 might be manifestations of this problem, but the problem could also appear in other ways.