galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Allow a posix file source to prefer linking. #19132

Closed. jmchilton closed this 2 weeks ago

jmchilton commented 2 weeks ago

The linking upload parameters will still be respected, but if none of them is set, data fetch will default to linking files during upload. This uses Dataset.external_filename instead of symlinks in the objectstore, so that Galaxy has better tracking of the links and so that this works closer to the way data libraries have always worked.
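As a rough illustration of the precedence described above (not the code in this PR), the decision can be sketched as follows. The function name `should_link` and its parameters are hypothetical; only `prefer_links` corresponds to a name that appears in this thread.

```python
from typing import Optional


def should_link(explicit_link: Optional[bool], prefer_links: bool) -> bool:
    """Decide whether an uploaded file is linked rather than copied.

    Hypothetical helper for illustration only.
    explicit_link: the value of a linking upload parameter, or None
                   when the user set none of them.
    prefer_links:  the file source's `prefer_links` setting.
    """
    if explicit_link is not None:
        # An explicit upload parameter always wins over the file source default.
        return explicit_link
    # Nothing was set on the upload, so fall back to the file source preference.
    return prefer_links
```

Under these assumptions, `should_link(None, True)` links the file, while `should_link(False, True)` still copies it because the explicit parameter takes priority.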

Alternative to https://github.com/galaxyproject/galaxy/pull/19125.

guerler commented 2 weeks ago

Thank you @jmchilton. I tried this and ran into an issue. The upload works fine and the dataset looks good, but when I attempt to run a job with that linked dataset, the tool cannot find the input file.

jmchilton commented 2 weeks ago

I've added a test case for running a tool afterward and it works fine... I'm going to have to gulp... use Galaxy... aren't I?

jmchilton commented 2 weeks ago

It worked fine for me. Details below.

Did this fail on your laptop or on a cluster? Can you confirm that the external_filename points at a valid path after the upload? Are any parent directories of the external_filename symbolic links? I've had problems with /private/tmp vs /tmp when using Docker, for instance.
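A quick way to answer those questions might look like the sketch below. It is a standalone script using only the Python standard library, not anything from Galaxy; the helper names `symlinked_parents` and `check_external_filename` are made up for this example, and the path would be whatever the dataset's external_filename reports.

```python
import os
from pathlib import Path
from typing import List


def symlinked_parents(path: str) -> List[str]:
    """Return every parent directory of `path` that is itself a symbolic link."""
    # abspath normalizes the path without resolving symlinks, so links stay visible.
    absolute = Path(os.path.abspath(path))
    return [str(parent) for parent in absolute.parents if parent.is_symlink()]


def check_external_filename(path: str) -> dict:
    """Summarize whether a linked path is reachable and where it really lives."""
    return {
        # False for a missing file or a broken link.
        "exists": os.path.exists(path),
        # The fully resolved location, e.g. /private/tmp/... for /tmp/... on macOS.
        "realpath": os.path.realpath(path),
        # Parents that are symlinks and may not exist inside a container or on a node.
        "symlinked_parents": symlinked_parents(path),
    }
```

Running `check_external_filename` on the same path from both the Galaxy host and the job's execution environment would show whether the two see the same file.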

Uploaded Dataset:

Screenshot 2024-11-13 at 10 09 48 AM

After running a tool on it:

Screenshot 2024-11-13 at 10 10 03 AM

Config:

- id: home_directory
  label: Home Directory
  doc: Your Home Directory on this System
  type: posix
  root: "/Users/jxc755/"
  prefer_links: true