nomad-coe / nomad

NOMAD lets you manage and share your materials science data in a way that makes it truly useful to you, your group, and the community.
https://nomad-lab.eu
Apache License 2.0

Feature request: Remove upload print limit for `nomad admin uploads ls` #102

Closed behnle closed 2 months ago

behnle commented 2 months ago

Current state: nomad admin uploads ls accepts three optional arguments: -e, --ids and --json.

Requested state:

The background for my request: I am operating an Oasis that stores data from various groups in the same instance on an expensive redundant storage system. Hence I was asked to provide storage accounting metrics broken down to the group level on a monthly basis. Assigning user IDs to groups is no problem, as the underlying Keycloak is under my control. But apart from nomad admin uploads ls, I currently see no straightforward way to map upload IDs to users, and I really do not see the point in reinventing the wheel. Of course I could hack the Python code in nomad/cli/admin/uploads.py:474, but such a modification would survive only until the next update.

lauri-codes commented 2 months ago

Hi @behnle! Sounds like a reasonable request. I think we can certainly make some improvements here. I would suggest the following changes:

I can keep you updated here. If you want something more powerful, you can also get this information by querying MongoDB directly, but that is certainly trickier than using the CLI.

behnle commented 2 months ago

@lauri-codes Thanks for the reply. I would be completely fine with --json printing all information (it would make parsing even easier), but unfortunately it also just prints the IDs: https://github.com/nomad-coe/nomad/blob/9d50a49795b33ee2655a2516321bbab1e29a4b1b/nomad/cli/admin/uploads.py#L470-L472 and there is no way of changing this behaviour.

lauri-codes commented 2 months ago

Yes, indeed, I saw this as well, and that is why I would suggest the change I mentioned above. It would make it easy for you to, e.g., pipe the output into a file.

behnle commented 2 months ago

For the time being, I figured out that

docker exec -it nomad_oasis_mongo mongoexport --db=nomad_oasis_v1 --collection=upload --pretty --fields="_id,main_author"

will give me the bare minimum of necessary information. I still need to figure out which key tells me whether I have to search in fs/staging or fs/public (probably published_to or current_process), but this is only nice-to-have.
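For the accounting step, the output of the mongoexport command above could be grouped by author roughly as follows. This is only a minimal sketch, not part of NOMAD: it assumes the export was redirected to a file, that every record carries `_id` and `main_author` as plain strings, and the function names `iter_json_documents` and `uploads_by_author` are hypothetical. Since --pretty writes one JSON object after another rather than a single array, the sketch decodes the objects one at a time.

```python
import json
from collections import defaultdict


def iter_json_documents(text):
    """Yield each object from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between consecutive documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        doc, pos = decoder.raw_decode(text, pos)
        yield doc


def uploads_by_author(text):
    """Map each main_author value to the list of upload IDs it owns."""
    grouped = defaultdict(list)
    for doc in iter_json_documents(text):
        # Assumption: "_id" is the upload ID and "main_author" the user ID.
        grouped[doc.get("main_author", "unknown")].append(doc["_id"])
    return dict(grouped)


if __name__ == "__main__":
    # Made-up sample records in the shape the export is assumed to have.
    sample = """
    {
        "_id": "upload-a",
        "main_author": "user-1"
    }
    {
        "_id": "upload-b",
        "main_author": "user-2"
    }
    """
    for author, upload_ids in uploads_by_author(sample).items():
        print(author, upload_ids)
```

The user IDs in the resulting dictionary can then be matched against the group assignments from Keycloak.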

lauri-codes commented 2 months ago

@behnle: The uploads ls command has now been updated based on our discussion here. The update will be immediately available in the develop branch and image, which you can use for testing, but I would not recommend pinning your OASIS to that image in general, since it can be unstable.

As to your other question regarding where the upload files are stored: published uploads end up in the public folder, while unpublished ones are in the staging folder. The publication status is given by upload.published.
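For the storage accounting, this rule could be turned into a small helper. The following is only a sketch under assumptions: the public and staging folders come from the discussion above, but placing each upload in a directory named after its ID directly below them is an assumption (the actual layout of an installation may differ), and `upload_dir` and `dir_size_bytes` are hypothetical names, not NOMAD functions.

```python
import os


def upload_dir(fs_root, upload_id, published):
    """Return the assumed directory of an upload below fs_root."""
    # Published uploads live under "public", all others under "staging".
    folder = "public" if published else "staging"
    # Assumption: the directory is named after the upload ID.
    return os.path.join(fs_root, folder, upload_id)


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

Summing `dir_size_bytes(upload_dir(...))` over the uploads of each user would give a per-user figure that can then be aggregated by group.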