MWATelescope / giant-squid

An alternative MWA ASVO client

No "pick up where you left off" option for failed downloads #27

Open baron-de-montblanc opened 1 week ago

baron-de-montblanc commented 1 week ago

Hello, I am trying to download some rather large observations from ASVO to our group's supercomputer through giant-squid. It is very common for the download to fail (see attached screenshot for example), probably due to the connection getting interrupted.

[Screenshot 2024-10-22 12:46 PM: example of a failed download]

My question is, is there an option/flag one can use with giant-squid to tell it to resume the download from where it crashed? (Or, alternatively, how could I successfully download these ~50 GB observations without it crashing?)

d3v-null commented 1 week ago

Hey Jade, that must be frustrating. We have a little bit of retry / error-handling logic in giant-squid, but it's clearly not doing its job.

In the meantime, here's how you can use wget to handle the download instead.

giant-squid list --json $query

will give you a bunch of metadata about the jobs matching $query, including a download link.

{
   "801409":{
      "obsid":1413666792,
      "jobId":801409,
      "jobType":"DownloadVisibilities",
      "jobState":"Ready",
      "files":[
         {
            "jobType":"Acacia",
            "fileUrl":"https://projects.pawsey.org.au/mwa-asvo/1413666792_801409_vis.tar?AWSAccessKeyId=...",
            "filePath":null,
            "fileSize":152505477120,
            "fileHash":"d6dfb7391a495b0eb07cc885808e9e8058e90ec3"
         }
      ]
   }
}

You can chuck fileUrl straight into wget, which has a lot of options for retrying downloads. I use --wait=60 --random-wait.
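
For a single job, something along these lines should do it (a sketch: the output filename and URL are placeholders taken from the JSON above, and the retry flags are just the ones I'd reach for):

# retry up to 10 times, waiting ~60 s (randomised) between attempts
wget --tries=10 --wait=60 --random-wait --progress=dot:giga \
    -O 1413666792.tar \
    "https://projects.pawsey.org.au/mwa-asvo/1413666792_801409_vis.tar?AWSAccessKeyId=..."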

If you want to automate this for many jobs, you can use jq, e.g.

giant-squid list -j --states=ready -- $obslist \
    | jq -r '.[]|[.jobId,.files[0].fileUrl//"",.files[0].fileSize//"",.files[0].fileHash//""]|@tsv' \
    | while read -r jobid url size hash; do
    # skip anything already downloaded
    [ -f "${jobid}.tar" ] && continue
    wget "$url" -O "${jobid}.tar" --progress=dot:giga --wait=60 --random-wait
done
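
The size and hash fields aren't used above, but assuming fileHash is a SHA-1 checksum (the 40 hex characters suggest it is, though I haven't double-checked), you could verify each tarball inside the loop right after the wget:

# verify the downloaded tarball against the ASVO-reported hash (assumed SHA-1)
echo "${hash}  ${jobid}.tar" | sha1sum -c - || echo "hash mismatch for job ${jobid}" >&2
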
gsleap commented 1 week ago

Hi Jade,

As Dev says, we currently don't have a continue-from-where-you-left-off feature as such, but it would be extremely valuable, especially for large downloads, so it will definitely be on our roadmap for a future release.

In the meantime, I think Dev has used the above technique successfully, so please give that a go and let us know how it goes!

gsleap commented 1 week ago

Oh, and @baron-de-montblanc @d3v-null, FYI: you can also pass wget -c / --continue to "resume getting a partially-downloaded file". I only just found it and it does appear to work quite nicely!
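
For example (a sketch; the filename and URL are placeholders from the JSON above):

# if 1413666792.tar already exists but is incomplete, -c picks up where it left off
wget -c -O 1413666792.tar "https://projects.pawsey.org.au/mwa-asvo/1413666792_801409_vis.tar?AWSAccessKeyId=..."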

d3v-null commented 1 week ago

A friendly reminder to anyone who comes across this issue: we take pull requests.

The main download loop is here.

It's wrapped in an exponential backoff here.

Compared to wget, this is download handling from the stone age.
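
In the meantime, for anyone who wants this behaviour client-side, here's a rough shell sketch of the same idea, an exponential-backoff retry loop around a resumable wget (only an illustration, not the Rust code linked above; url and outfile are placeholders):

# sketch: retry a resumable download, doubling the wait after each failure
url="https://projects.pawsey.org.au/mwa-asvo/..."   # placeholder download link
outfile="1413666792.tar"                            # placeholder output name
delay=60
for attempt in 1 2 3 4 5; do
    # -c resumes a partial ${outfile} instead of starting over
    wget -c "$url" -O "$outfile" && break
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
done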