SciCrunch / sparc-curation

code and files for SPARC curation workflows
MIT License

pipelines breaking network sandbox e.g. PipelineExtras.added #28

Open tgbugs opened 4 years ago

tgbugs commented 4 years ago

https://github.com/SciCrunch/sparc-curation/blob/31f4d0a7af02103b6a251fe013b2a5cce47a5720/sparcur/pipelines.py#L951-L953

Use the new BlackfynnDatasetData source via spc rmeta to fetch the bf platform metadata ahead of time, then merge it without having to hit the network.

A number of other sources break the network sandbox as well. Adding a pull phase for all of the remote sources seems like a good idea. Some cases do need later network retrieval phases, e.g. protocols, but that is not the normal case.
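
A rough sketch of what a separate pull phase could look like, with the pipeline reading only from a local cache afterwards. This is not the sparcur implementation: `pull_remote_metadata`, `load_cached_metadata`, `merge_platform_metadata`, the `fetch` callable, and the one-JSON-file-per-dataset cache layout are all hypothetical names and assumptions made for illustration.

```python
import json
from pathlib import Path


def pull_remote_metadata(dataset_id, fetch, cache_dir):
    """Pull phase: the only step that is allowed to touch the network.

    `fetch` is a hypothetical callable that takes a dataset id and returns
    a JSON-serializable dict of platform metadata.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{dataset_id}.json"
    path.write_text(json.dumps(fetch(dataset_id)))
    return path


def load_cached_metadata(dataset_id, cache_dir):
    """Pipeline phase: read from the local cache only, never the network."""
    path = Path(cache_dir) / f"{dataset_id}.json"
    if not path.exists():
        raise FileNotFoundError(
            f"no cached metadata for {dataset_id}; run the pull phase first"
        )
    return json.loads(path.read_text())


def merge_platform_metadata(data, dataset_id, cache_dir):
    """Merge cached platform metadata into pipeline data, offline."""
    merged = dict(data)  # copy so the caller's dict is not mutated
    merged.update(load_cached_metadata(dataset_id, cache_dir))
    return merged
```

With this split, a missing cache entry fails loudly during the pipeline run instead of silently falling back to a network request. Sources that really do need a later retrieval phase, such as protocols, would need their own explicit pull step.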

tgbugs commented 9 months ago

This has been partially addressed by providing wrappers around network calls that raise errors if the sandbox is enabled. Each case has to be handled explicitly, so it is possible to accidentally reintroduce the behavior if a rogue call is added, but it at least helps hunt down issues. Another possible approach to detecting issues would be to use something like the Portage network sandbox to detect violations.
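
For readers unfamiliar with the wrapper approach, here is a minimal sketch of the general idea. It is not the code in sparcur: `Sandbox`, `NetworkSandboxError`, `network_call`, and `fetch_remote_metadata` are hypothetical names used only for illustration.

```python
import functools


class NetworkSandboxError(Exception):
    """Raised when a wrapped network call runs while the sandbox is on."""


class Sandbox:
    # Hypothetical global switch; a real project might read this from config.
    enabled = False


def network_call(func):
    """Mark `func` as a network call and refuse to run it when sandboxed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if Sandbox.enabled:
            raise NetworkSandboxError(
                f"{func.__qualname__} attempted network access "
                "while the sandbox is enabled"
            )
        return func(*args, **kwargs)
    return wrapper


@network_call
def fetch_remote_metadata(dataset_id):
    # Placeholder body standing in for a real remote request.
    return {"id": dataset_id}
```

The limitation is the one described above: only functions that have been explicitly decorated are guarded, so a new undecorated call goes through unnoticed. An OS-level sandbox would catch those cases because it blocks the process from the network regardless of which code path makes the request.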