Open jrchudy opened 7 months ago
Looking at the hatrac REST-API doc, I see this line:
> Note, there is no support for determining which chunks have or have not been uploaded as such tracking is not a requirement placed on Hatrac implementations.
Issue #1837 is related to this issue. #1837 resets the file upload job when a user logs back in after their session expires. The work described here would further improve that failure scenario but won't "fix" that issue. Ideally, to address #1837 we wouldn't refresh the page afterwards at all, but this issue will still improve that feature since other events can refresh the page and force a restart.
Step 1 from the main message above has been merged. Moving this issue to "Scheduled" for implementing steps 2 and 3.
Using `recordedit` to upload files doesn't properly resume an upload if the connection to the server was lost or the window was refreshed. For instance, if a user is uploading a 200 MB file and only half of the file gets uploaded before an interruption, the user has to restart the upload process from the beginning. We should properly "resume" the file upload if a partial upload already exists on the server. This will be handled in multiple steps:
step 1 - resume on connection interruption
To resume a file upload that was interrupted in `recordedit`, the following should be done:
- Check the `jobUrl` to ensure it is the same "path" that is being uploaded to when the upload process is attempting to resume.
- Keep a map for storing information about incomplete upload jobs:
  - `lastChunkIdx` - the index of the last chunk that was successfully uploaded
  - `jobUrl` - the hatrac namespace with the upload job appended to the end
  - `fileSize` - the size of the file initially uploaded, to help ensure the resumed file is the same as the original
  - `uploadVersion` - the final name for an upload job after the job is marked as complete
- The `key` in the map is intended to ensure each upload that is being resumed is for the same file (checksums match) being uploaded to the same column and recordedit form index.
- When checking whether an `UploadFileObject` is a partial upload, verify that:
  - the `jobUrl` matches the one we tracked
  - `lastChunkIdx` indicates some chunks have already been uploaded
  - the `fileSize` matches the one we tracked
  - `uploadVersion` is not set yet
- When creating the namespace `/hatrac/path/to/file.txt` (without `;upload/somehash`), if we get a 409 response, assume the namespace already exists and use the tracked `jobUrl` (`/hatrac/path/to/file.txt;upload/somehash`) for the upload instead of creating a new upload job.
- When starting the upload job, `lastChunkIdx` is used to indicate which chunk to start uploading from so the job is properly resumed and we don't upload any duplicate chunks.
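The tracking map and partial-upload check above could be sketched roughly as follows. All names here (`IncompleteJobInfo`, `isResumablePartialUpload`, the key format) are illustrative assumptions, not the actual chaise implementation:

```typescript
// Hypothetical shape for the incomplete-upload tracking map described above.
interface IncompleteJobInfo {
  lastChunkIdx: number;   // index of the last successfully uploaded chunk
  jobUrl: string;         // hatrac namespace with ";upload/<hash>" appended
  fileSize: number;       // size of the originally uploaded file
  uploadVersion?: string; // final name, set only once the job is complete
}

type IncompleteJobMap = Map<string, IncompleteJobInfo>;

// Key encodes checksum + column + form index so a resume only matches
// the same file going to the same destination (key format is assumed).
function makeKey(checksum: string, columnName: string, formIndex: number): string {
  return `${checksum}_${columnName}_${formIndex}`;
}

// A job is resumable when the tracked metadata matches the current file,
// some chunks were uploaded, and the job was never finalized.
function isResumablePartialUpload(
  map: IncompleteJobMap,
  key: string,
  currentJobUrl: string,
  currentFileSize: number
): boolean {
  const info = map.get(key);
  return (
    info !== undefined &&
    info.jobUrl === currentJobUrl &&
    info.lastChunkIdx >= 0 &&
    info.fileSize === currentFileSize &&
    info.uploadVersion === undefined
  );
}
```

If any of the checks fail (for example, the file size differs from what was tracked), the safe fallback is to discard the tracked entry and start a fresh upload job.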
step 2 - when the page is reloaded
Other changes to accomplish this across reloads include:
More information that should be stored:
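One way to carry this tracking information across page reloads is to serialize the map to browser storage. This is a sketch under the assumption that `localStorage` (or something similar) is used; the storage key and helper names are made up, and the storage interface is abstracted so it can be stubbed outside a browser:

```typescript
// Hypothetical job info shape, matching the map described in step 1.
interface IncompleteJobInfo {
  lastChunkIdx: number;
  jobUrl: string;
  fileSize: number;
  uploadVersion?: string;
}

// Minimal subset of the Web Storage API, so window.localStorage fits directly.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = 'recordedit-incomplete-uploads'; // illustrative key name

// Persist the whole map so a reloaded page can pick up where it left off.
function saveJobs(store: KVStore, jobs: Record<string, IncompleteJobInfo>): void {
  store.setItem(STORAGE_KEY, JSON.stringify(jobs));
}

// Restore the map on page load; an empty store yields an empty map.
function loadJobs(store: KVStore): Record<string, IncompleteJobInfo> {
  const raw = store.getItem(STORAGE_KEY);
  return raw ? JSON.parse(raw) : {};
}
```

On reload, entries whose `uploadVersion` is already set (completed jobs) could be pruned before resuming anything.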
step 3 - resuming in a different tab/window
Other changes to accomplish this across multiple tabs/windows:
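A minimal sketch of one possible cross-tab coordination scheme (an assumption, not the actual design): record which tab currently "owns" an upload job in shared storage, with a heartbeat timestamp, so a different tab only takes over a job once its owner has gone stale. This avoids two tabs uploading chunks to the same hatrac job at once:

```typescript
// Minimal subset of the Web Storage API; window.localStorage would fit here.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical threshold: an owner tab is considered dead after 30s
// without refreshing its heartbeat.
const STALE_MS = 30_000;

interface JobOwner {
  tabId: string;    // unique id generated per tab (illustrative)
  heartbeat: number; // epoch millis of the owner's last heartbeat
}

// Try to claim the job for this tab; returns true if this tab now owns it.
// The owning tab would call this periodically to refresh its heartbeat.
function claimJob(store: KVStore, jobUrl: string, tabId: string, now: number): boolean {
  const raw = store.getItem(`owner:${jobUrl}`);
  if (raw) {
    const owner: JobOwner = JSON.parse(raw);
    if (owner.tabId !== tabId && now - owner.heartbeat < STALE_MS) {
      return false; // another live tab is still uploading this job
    }
  }
  store.setItem(`owner:${jobUrl}`, JSON.stringify({ tabId, heartbeat: now }));
  return true;
}
```

In a real browser, the `storage` event (or `BroadcastChannel`) could additionally notify other tabs when ownership changes, rather than relying on polling alone.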