Open mshadbolt opened 4 years ago
@clairerye can you please prioritise this?
If this is the only tool we have to transfer files to an upload area at the moment, then it needs to be fixed as a high priority. But if there is a workaround we can use in the meantime that can be communicated to operations, then it can wait.
As far as I am aware this is the way we want to be doing this rather than using the old CLI. @MightyAx or @prabh-t. It looks like a bug, or at least it's not performing the required and expected behaviour, so I agree we need to get it fixed asap. @MightyAx are you happy to take a look at this as soon as possible?
I've taken a quick look and was waiting for the issue to be prioritised. Can continue to investigate further or also happy to leave it to @MightyAx to do.
If you have already started, maybe it's easier for you to carry on? It would be good to make sure that you aren't the only one familiar with it though. I will leave the decision to you @prabh-t and @MightyAx. In the meantime, do you suggest @mshadbolt uses the hca cli or waits? I think she is working to a fairly tight time frame.
If this can wait until end of day/early morning tomorrow, I'd suggest we wait. I will have to run the sync command against this upload area to try reproducing the issue, as it seems to be to do with this particular dataset. I may end up transferring the data to where it needs to be in the process, if that's OK.
@mshadbolt Is that reasonable? If you were wanting to submit this today, I think using the hca cli is your best option. I will leave you to co-ordinate with @prabh-t as my access will be limited for the rest of the day, sorry.
Has anyone used the command successfully on a real dataset?
sure. I can use the hca cli.
hca cli is working perfectly so I don't think there is anything wrong with the files
Hi Marion, I've identified the issue. The command takes a while initially (roughly 10 mins) because it tries to gather metadata for each object (600+) before running the sync in parallel. I've changed that to happen in the thread now, so you're not blocked at the start and you can see progress happening. The reason why some files were transferred and others not was a wrong determination of content type for certain files. These are fixed now and I'm going to update the package after running the tests to be sure nothing else breaks. This is going to happen a bit later tonight, so if you want me to do the transfer, let me know.
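For anyone following along, here is a minimal sketch of the change described above, not the actual hca-util code. It assumes the per-object metadata lookup can run inside each worker thread rather than in a blocking pass before the sync starts. `fetch_metadata`, `transfer_one` and `sync_objects` are hypothetical names, and the `mimetypes` fallback is just one plausible way to avoid an undetermined content type.

```python
import mimetypes
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def guess_content_type(key, default="application/octet-stream"):
    """Guess a content type from the file name, with a generic fallback."""
    content_type, _encoding = mimetypes.guess_type(key)
    return content_type or default


def fetch_metadata(key):
    """Hypothetical stand-in for a per-object metadata request."""
    time.sleep(0.01)  # simulate network latency
    return {"key": key, "content_type": guess_content_type(key)}


def transfer_one(key):
    """Runs in a worker thread: gather metadata, then handle that one object."""
    meta = fetch_metadata(key)
    # A real implementation would copy the object here using meta["content_type"].
    return meta


def sync_objects(keys, max_workers=8):
    """Submit every object immediately; there is no metadata pass up front."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transfer_one, key): key for key in keys}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

With 600+ objects, the metadata requests are spread across the workers instead of being made one by one before any transfer begins.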
v0.2.8 addresses the above issues. It also replaces the individual file progress indication with a single overall progress bar, as was discussed on Slack. (The upload and download commands still use the old progress indication, which remains problematic with a large number of files.) I have added a release note to the code repo here and I have a draft PyPI package release SOP here. It'd be great to have your input on it. @lauraclarke @clairerye @mshadbolt
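As an illustration only, a single overall progress indicator shared by all worker threads could look roughly like the sketch below. It uses just the standard library, and `OverallProgress` and `transfer_all` are hypothetical names rather than anything from hca-util.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class OverallProgress:
    """Thread-safe counter that prints one overall progress line."""

    def __init__(self, total):
        self.total = total
        self.done = 0
        self._lock = threading.Lock()

    def update(self, n=1):
        # The lock keeps the count and the printed line consistent across threads.
        with self._lock:
            self.done += n
            print(f"\rtransferred {self.done}/{self.total} files", end="", flush=True)


def transfer_all(keys, transfer, max_workers=8):
    """Run transfer(key) for every key and report one shared progress count."""
    progress = OverallProgress(len(keys))

    def worker(key):
        transfer(key)
        progress.update()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(worker, keys))  # consume the iterator so errors surface
    print()
    return progress.done
```

One counter for the whole run keeps the output to a single line regardless of how many files are in the upload area.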
When you get the chance, can I have edit access please so I can make a few comments/suggestions? But this looks like a great plan overall. @ami-day when you are ready, are you able to start the testing in ticket #235?
I believe this ticket can be moved to Done / In Production / Closed
I attempted to use the hca-util sync command to transfer files from a hca-util upload area to an ingest submission upload area.
It hung for a long time, then eventually said 'transferring'. Then all the transfers failed.
etc