Open gigjozsa opened 2 years ago
Hi Josh, you can have a look at the `mvf_copy.py` script in the `scripts` directory. This also allows some rudimentary filtering of the data to avoid copying the data you don't want (I'm still busy expanding the filtering options).
One downside of the script is that it cannot continue with a partial copy after a crash, unlike `mvftoms` and `wget` / `curl`, as illustrated in the diagram below:
Another option is `rclone`. I've used this on our own cluster machines with good success, but I still have to figure out a suitable recipe for token authentication for external access.
Also, be aware that you don't get a single HDF5 file like KAT-7 produced, but a directory with hundreds (or thousands) of NPY files, as well as an RDB file as point of entry. This is our chunked "MVF4" format.
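To get a feel for what a local copy looks like, a small sketch along these lines can summarise a copied directory. It only assumes what is described above (many NPY files plus an RDB file as the entry point). The function name `summarise_dataset_dir` is made up for illustration, and nothing here is part of katdal.

```python
from collections import Counter
from pathlib import Path


def summarise_dataset_dir(root):
    """Count files by extension under a local copy and collect the RDB files.

    Hypothetical helper: it makes no assumption about how the chunk
    directories are named, only that NPY and RDB files live under `root`.
    """
    root = Path(root)
    counts = Counter()
    rdb_files = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        counts[suffix] += 1
        if suffix == ".rdb":
            rdb_files.append(path)
    return counts, sorted(rdb_files)
```

Calling `summarise_dataset_dir("/path/to/local/copy")` returns the per-extension counts and the list of RDB files, which is a quick way to check that a copy contains an entry point at all.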
I'll see if I can get `rclone` to work, and improve `mvf_copy.py` as well in the meantime.
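For reference, the kind of resumable behaviour that is missing could look roughly like the sketch below. This is not how `mvf_copy.py` works. It is a plain directory-to-directory copy on a local filesystem, and the name `resumable_copy` and the `.part` suffix are assumptions made for illustration. The idea is to skip any file that already exists at the destination with the same size, and to write each file under a temporary name so a crash never leaves a truncated file that looks complete.

```python
import shutil
from pathlib import Path


def resumable_copy(src_root, dest_root):
    """Copy a directory tree, skipping files already copied in full.

    Hypothetical sketch: a file counts as done if the destination exists
    and has the same size as the source (no checksum is compared).
    Returns (files_copied, files_skipped).
    """
    src_root = Path(src_root)
    dest_root = Path(dest_root)
    copied, skipped = 0, 0
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dest = dest_root / src.relative_to(src_root)
        if dest.exists() and dest.stat().st_size == src.stat().st_size:
            skipped += 1
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first, then rename into place.
        tmp = dest.with_name(dest.name + ".part")
        shutil.copyfile(src, tmp)
        tmp.replace(dest)
        copied += 1
    return copied, skipped
```

Running it a second time after an interruption only copies what is missing, since completed files are recognised by their size.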
I haven't yet found a way to dump an HDF5 file, as read, onto a local disk. That is: read a file from the archive with `katdal.open`, dump it to disk as is, and then read it again with `katdal.open`. A local copy makes things much faster if you have to repeat them. If there is such a method, I'd appreciate a hint; if not, it might be good to implement it.