data-preservation-programs / singularity

Tool for onboarding data to the Filecoin Network

[Feature Request]: singularity prep status with -l long list OR -v verbose #294

Closed distora-w3 closed 11 months ago

distora-w3 commented 1 year ago

What is the problem you're trying to solve?

I would like to see more detail in the prep status output (more information was available in singularity V1).

Current version: singularity v0.4.0-36ecfc5

Would like to see something like this:

singularity prep status 20230911-19
ID  Name            DeleteAfterExport  MaxSize      PieceSize
2   20230911-19     false              33822867456  34359738368
    Source Storages:
        ID  Name          Type   Path
        1   webak-upload  local  /mnt/blockstorage/testround4/01-upload
        ------------------------------------------
        ... file1
        ... file 2
        ... file 3
        -----------------------------------------

    Output Storages:
        ID  Name           Type   Path
        2   webak-prepout  local  /mnt/blockstorage/testround4/02-out
        --------------------------------------------
        ... CID 1    size xxxxx
        ... CID 2    size yyyyy
        ... CID 3    size zzzzz
        --------------------------------------------

Describe the workaround you currently have

.

Describe the feature you'd like

See example above. (Also in previous testing, some kind of progress bar / % complete)

Additional context

No response

xinaxu commented 1 year ago

That will crash the server and the terminal for a large dataset. A better approach is to offer a CLI command that lists the files inside a CAR/job, so you can use a script to iterate through all jobs/CARs and print out all files. Though I would still ask: why do you need to list the files in each CAR file? If you only need to know the % complete, the list of jobs and their corresponding statuses would be enough.

distora-w3 commented 1 year ago

No, no. Not to list files "inside". This is more like singularity v1. Alternatively a count would work:

to prep: x   
prepped: y

Typically, I will open a terminal and run something like watch -n 30 singularity prep status [NAME]. This is just a quick way to monitor progress, alongside, say, a similar command for disk space.

The aim is to get a view of the health of the processes and system resources. Simply put, you are visually scanning for a healthy increase in the output files.

distora-w3 commented 1 year ago

However, your CLI suggestion is not bad :) It is like the "find" command, which actually might have many more uses.

xinaxu commented 1 year ago

"Typically, I will open a terminal and run something like watch -n 30 singularity prep status [NAME]"

This is more like a command for monitoring. Let me think about it.

Right now you can use a post-processing tool, e.g. jq, bash, or perl, to get aggregated stats from the current console output:

watch -n 10 './singularity --database-connection-string "sqlite:test.db" --json prep status 1 | awk '\''BEGIN {RS=","}
/type/ {type=$2}
/state/ {state=$2; 
         gsub(/[^a-zA-Z0-9-]/, "", type);
         gsub(/[^a-zA-Z0-9-]/, "", state);
         count[type "-" state]++
        }
END {for (key in count) print key, count[key]}'\'''
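
The awk one-liner above splits the JSON text on commas, which can miscount if a value itself contains a comma. A more structure-aware variant could parse the JSON properly. The following is only a minimal sketch in Python: it assumes, as the awk patterns suggest, that the --json output contains objects carrying both a "type" and a "state" field. The exact shape of the output is not documented in this thread, so the function simply walks whatever structure it is given. The script name count_states.py and the file name status.json are hypothetical.

```python
import json
import sys
from collections import Counter


def count_type_state(node, counts=None):
    """Count every JSON object that has both a 'type' and a 'state' key.

    Assumption: job-like objects in the status output carry these two
    fields. The real schema may differ, so the walk is fully recursive.
    """
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        if "type" in node and "state" in node:
            counts[f"{node['type']}-{node['state']}"] += 1
        # Keep descending: matching objects may be nested anywhere.
        for value in node.values():
            count_type_state(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_type_state(item, counts)
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 count_states.py status.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        totals = count_type_state(json.load(handle))
    for key in sorted(totals):
        print(key, totals[key])
```

One way to use it would be to save the status first, for example ./singularity --json prep status 1 > status.json, and then run python3 count_states.py status.json, optionally under watch as in the command above. Unlike the awk version, this sketch only counts a pair when both fields sit in the same object.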

xinaxu commented 1 year ago

I think I will move this feature request to the web UI, since the CLI is quite limited and not very extensible.