DiSSCo / SDR

Specimen Data Refinery
Apache License 2.0

Create a Galaxy tool to download image #34

Closed benscott closed 2 years ago

benscott commented 2 years ago

Tools like Teklia's require a local file to read, so we need a tool that takes an image URI, downloads it to the server and updates the JSON with the local file path.

At this stage, we don't need to worry about #16 as throughput will be low - but this tool can be updated once the image server is needed/available.
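As a rough illustration of the tool described above, here is a minimal Python sketch that takes an image URI from the JSON, downloads the file, and writes the local path back into the JSON. It is not the implementation that was checked in. The field names `image_uri` and `local_path`, and both function names, are assumptions made for the example; the thread does not define the JSON schema.

```python
import json
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_image(image_uri: str, dest_dir: str) -> Path:
    """Download image_uri into dest_dir and return the local path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Derive a file name from the URI path; fall back to a generic name.
    name = Path(urlparse(image_uri).path).name or "image"
    local_path = dest / name
    # Stream the response to disk rather than reading it all into memory.
    with urllib.request.urlopen(image_uri) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path


def add_local_path(specimen_json: str, dest_dir: str) -> str:
    """Return the specimen JSON with the downloaded file's path added."""
    record = json.loads(specimen_json)
    # "image_uri" and "local_path" are hypothetical field names.
    local_path = download_image(record["image_uri"], dest_dir)
    record["local_path"] = str(local_path)
    return json.dumps(record, indent=2)
```

A Galaxy tool wrapper would call something like `add_local_path` with the input JSON and a working directory, and emit the returned JSON as its output dataset.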

PaulBrack commented 2 years ago

This one has two parts, but to me it raises a question: do we want to use conventional local file storage, or do this in a more Galaxy way? A Galaxy tool that just does a wget (or similar) and adds a location to the JSON should take 1-2 days max, but I think it's clunky. I'd question the merit of having the local path in the specimen data object, since it's ultimately a temporary file. I'd prefer to pass the image itself as a separate Galaxy variable. What do you think?

@benscott

benscott commented 2 years ago

For the proof of concept, I think downloading to local file storage is fine. What would the Galaxy way of doing things entail? We have plans to use the DiSSCo CORDRA server moving forward - see #16 - so the image implementation is likely to change anyway. Let's just go for the quick and easy way of doing this.

The local file is essentially a full-sized derivative, and there will be more derivatives if we want to perform any preprocessing before the next stage, which is quite possible (scaling hi-res images, etc.). So I think it belongs in the JSON image, but as a derivative, not replacing the remote URL. Needs to be captured in #17.
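One possible way to record the local file as a derivative without touching the remote URL is sketched below. The keys `image`, `uri`, `derivatives`, `kind`, and `path` are illustrative assumptions only; the actual structure is left to #17.

```python
from copy import deepcopy


def add_derivative(record: dict, local_path: str, kind: str = "full_size") -> dict:
    """Return a copy of record with a derivative entry appended."""
    updated = deepcopy(record)
    image = updated.setdefault("image", {})
    # The remote URL (hypothetical key "uri") is left untouched;
    # each local file is appended to a hypothetical "derivatives" list.
    image.setdefault("derivatives", []).append({"kind": kind, "path": local_path})
    return updated
```

Calling it again with a different `kind` (for example a scaled copy produced by a preprocessing step) adds a second entry alongside the first.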

llivermore commented 2 years ago

@PaulBrack we need a default directory for the image folder for #44. Image derivative path needs to be in the oDS data.

PaulBrack commented 2 years ago

Checked in with 97d3819be9964a80c974ef629f64f7808fcbb7d5