SciCatProject / scitacean

High level Python API for SciCat
https://scicatproject.github.io/scitacean/
BSD 3-Clause "New" or "Revised" License

SciCat profiles #37

Open jl-wynen opened 1 year ago

jl-wynen commented 1 year ago

This is a suggestion for making the setup of clients more user-friendly.

Description

Add 'profiles' (name up for debate) that define an API URL, file transfer mechanisms, and potentially more, so that a client can be created in any of these ways:

# builtin profile
client = Client.from_token(profile='ess', token=...)

# from a file
client = Client.from_token(profile='my-profile.toml', token=...)

# programmatically
profile = Profile(url=..., file_transfer=...)
client = Client.from_token(profile=profile, token=...)

and the profile would be along these lines:

url = "https://ess.scicat.eu/api/v3"

[[file_transfer]]
type = "link"

[[file_transfer]]
type = "ssh"
host = "login.dmsc.dk"
remote_base_path = "/ess/data"

This would define a client that talks to the production instance at ESS. For files, it would first try to symlink them if we have direct access to the file system (this needs to be implemented separately) and, if that is not possible, fall back to SSH.

Users, maintainers, or admins at other facilities can then write their own profiles and either integrate them into Scitacean or provide them in a different way.

Further attributes can be added if need be (e.g. how to authenticate with the file server).

This would not replace the current mechanism but would be an alternative.

Benefits

Drawbacks

jl-wynen commented 1 year ago

Concerning file transfers, there are additional options for finding out what to do:

bpedersen2 commented 9 months ago

I would prefer to add these options to the SciCat backend, in the dataset, as the storage can differ depending on the dataset (we plan to store smaller datasets in S3, available via an HTTPS broker, with direct S3 access as an alternative option, and large datasets (>>1TB) in another storage system). So enhancing the way the data access is stored in SciCat is probably the way to go.

jl-wynen commented 9 months ago

Interesting. How are you going to handle this information? Does the user have to specify it during upload or will it be assigned automatically by the backend? If it's the latter, then the client (Scitacean) still needs to know how to upload the files.