Closed brancomat closed 1 year ago
Let's work out how it works first, then where to document it.
In theory, if you have something like datasets/lami123/.archive/last/2022/2022-12.grib, you can do this:
arki-check --unarchive 2022/2022-12.grib datasets/lami123
And this should move that segment into datasets/lami123/2022/2022-12.grib, and index it as part of the online dataset.
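To make the path mapping concrete, here is a minimal Python sketch of where a segment is expected to sit before and after unarchiving. It only computes paths and does not call arkimet; the function name unarchive_paths is hypothetical and is not part of arkimet.

```python
from pathlib import Path

# Location of archived segments inside a dataset, as in the example above.
ARCHIVE_SUBDIR = Path(".archive") / "last"


def unarchive_paths(dataset: str, segment: str) -> tuple[Path, Path]:
    """Return (archived path, online path) for a segment.

    `segment` is the path relative to the dataset root, which is what
    is passed to `arki-check --unarchive` in the example above.
    Hypothetical helper, for illustration only.
    """
    root = Path(dataset)
    return root / ARCHIVE_SUBDIR / segment, root / segment


if __name__ == "__main__":
    src, dst = unarchive_paths("datasets/lami123", "2022/2022-12.grib")
    print(src.as_posix())
    print(dst.as_posix())
```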
Does this match the behaviour you observe?
Yes. My question is whether the current implementation allows specifying more than one file (directories or wildcards).
Side note: I tried a couple of things (admittedly, not very clever) that had an unexpected effect on lock file creation in the $dataset/$year directory (in this example: cosmo/2022). I don't know whether it could be considered a bug:
$ ls cosmo/2022/
$ arki-check --unarchive 2022/\*.grib cosmo/
Traceback (most recent call last):
  File "/usr/bin/arki-check", line 11, in <module>
    main()
  File "/usr/bin/arki-check", line 7, in main
    sys.exit(Check.main())
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/base.py", line 83, in main
    return cmd.run()
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/check.py", line 133, in run
    arki_check.unarchive(pathname=self.args.unarchive)
RuntimeError: cannot rename /home/dbranchini@ARPA.EMR.NET/Scaricati/arkitest/cosmo/.archive/last/2022/*.grib to /home/dbranchini@ARPA.EMR.NET/Scaricati/arkitest/cosmo/2022/*.grib: No such file or directory
$ ls cosmo/2022/
'*.grib.lock'
$ arki-check --unarchive 2022/* cosmo/
Traceback (most recent call last):
  File "/usr/bin/arki-check", line 11, in <module>
    main()
  File "/usr/bin/arki-check", line 7, in main
    sys.exit(Check.main())
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/base.py", line 83, in main
    return cmd.run()
  File "/usr/lib/python3.10/site-packages/arkimet/cmdline/check.py", line 133, in run
    arki_check.unarchive(pathname=self.args.unarchive)
RuntimeError: cannot auto-detect format from file name 2022/*: file extension not recognised
$ ls cosmo/2022/
'*.grib.lock' '*.lock'
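In both attempts the pattern reaches arki-check unexpanded (the error messages show a literal *), so --unarchive treats it as a single segment name. As a possible workaround, the Python sketch below expands the pattern against the dataset's .archive/last directory and builds one arki-check --unarchive command per match. It assumes one invocation per segment behaves as described earlier in this thread; the function name unarchive_commands is hypothetical, and the commands are only printed, not executed.

```python
from pathlib import Path


def unarchive_commands(dataset: str, pattern: str) -> list[list[str]]:
    """Build one `arki-check --unarchive` command per archived segment.

    `pattern` is a glob relative to <dataset>/.archive/last, for
    example "2022/*.grib". Hypothetical helper, for illustration only.
    """
    archive = Path(dataset) / ".archive" / "last"
    commands = []
    for path in sorted(archive.glob(pattern)):
        # Segment name relative to the archive root, e.g. 2022/2022-12.grib
        segment = path.relative_to(archive).as_posix()
        commands.append(["arki-check", "--unarchive", segment, dataset])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in unarchive_commands("cosmo", "2022/*.grib"):
        print(" ".join(cmd))
```

The pattern should match only the segments themselves, since every match becomes a separate --unarchive call.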
Right, yes, I see I have work to do to make it not just working, but also usable.
It makes sense to make it take segment names, and infer datasets from them.
I'll work on this.
In the issue297 branch there's a version of arkimet that adds the arki-maint command. arki-maint allows subcommands, and it currently only has the unarchive subcommand, which works like this:
arki-maint unarchive dataset/.archive/last/2022-*.grib
It will look for .archive/last in each of its arguments, infer the dataset directories and the segment names from that, and do the equivalent of running arki-check on each dataset and on each segment.
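The inference step can be illustrated with a short Python sketch. This is not the code in the issue297 branch, only an assumption about how a path could be split at .archive/last into a dataset directory and a segment name; split_archived_path is a hypothetical name.

```python
from pathlib import PurePosixPath


def split_archived_path(path: str) -> tuple[str, str]:
    """Split an archived segment path into (dataset directory, segment name).

    Looks for the consecutive components ".archive" and "last":
    everything before them is taken as the dataset directory and
    everything after them as the segment name.
    Hypothetical helper, not arkimet's actual implementation.
    """
    parts = PurePosixPath(path).parts
    # Stop early enough that at least one component follows "last".
    for i in range(len(parts) - 2):
        if parts[i:i + 2] == (".archive", "last"):
            dataset = PurePosixPath(*parts[:i]) if i > 0 else PurePosixPath(".")
            segment = PurePosixPath(*parts[i + 2:])
            return str(dataset), str(segment)
    raise ValueError(f"{path}: no segment found under .archive/last")


if __name__ == "__main__":
    print(split_archived_path("dataset/.archive/last/2022-12.grib"))
```

Since the shell expands the wildcard before the command starts, each argument is already a single path, and each one can be split independently in this way.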
I did quite a bit of refactoring in the command line parsing code to be able to share code between normal commands and commands with subcommands. That's why I'm pushing to a separate branch and not to master.
There's no mention of the --unarchive option in https://arpa-simc.github.io/arkimet/datasets/archive.html (or in any other part of the doc). The only mention I found is in the help and in the man page of arki-check.
This is a bit misleading since by trial and error it seems that it accepts only specific filenames (no paths, no wildcards). Is this correct?