Closed Jeltje closed 8 years ago
I doubt this has to do with --overwrite. It's more likely a consistent bug in upload.
Could you delete the destination file from cgl-driver-projects-encrypted and retry the upload with --debug?
That corrects the size (and so probably the issue):
s3am upload --debug --src-sse-key-file ./20160707.key --sse-key-file ./master.key --sse-key-is-master --download-slots 40 --upload-slots 40 --part-size 50M s3://cgl-inbox-su2c-ucsf/DTB-157-BL-T-DNA-HSmerge_S1.bam s3://cgl-driver-projects-encrypted/ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam 2>upload.err
http://hgwdev.cse.ucsc.edu/~jeltje/toil/upload.err
aws --region us-west-2 s3 ls --recursive --summarize cgl-driver-projects-encrypted | grep DTB-157-BL-T-DNA-HSmerge_S1.bam
2016-07-21 13:38:01 8769087566 ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
Not sure what this tells you about the attempt to overwrite?
Nothing. Just trying to rule out the possibility of a truncation issue with upload.
Is it possible that the 8769003598 version of that file in the cgl-driver-projects-encrypted bucket was produced using either a different key or without --sse-key-is-master?
Nope. Also, that wouldn't produce anything I could download later, I think.
Bam file goes in, truncated bam file comes out. I can do another --exists overwrite with --debug on a file I did not delete, if that helps?
Yes, that'd be great. Can you run
# upload local file
s3am upload --exists overwrite --debug --sse-key-file ./master.key --sse-key-is-master --download-slots 40 --upload-slots 40 --part-size 50M file:///some/local/file s3://cgl-driver-projects-encrypted/ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
aws --region us-west-2 s3 ls --recursive --summarize cgl-driver-projects-encrypted | grep DTB-157-BL-T-DNA-HSmerge_S1.bam
# copy file from inbox again
s3am upload --exists overwrite --debug --src-sse-key-file ./20160707.key --sse-key-file ./master.key --sse-key-is-master --download-slots 40 --upload-slots 40 --part-size 50M s3://cgl-inbox-su2c-ucsf/DTB-157-BL-T-DNA-HSmerge_S1.bam s3://cgl-driver-projects-encrypted/ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
aws --region us-west-2 s3 ls --recursive --summarize cgl-driver-projects-encrypted | grep DTB-157-BL-T-DNA-HSmerge_S1.bam
Just add output redirection as necessary.
That overwrites the file correctly
first upload:
2016-07-21 19:42:27 11724 ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
second upload:
2016-07-21 19:46:18 8769087566 ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
I tried to overwrite one of the other trouble files using the exact command I used before (same bash script), and it appears to work fine.
In other words, can't reproduce the issue.
Using s3am (2.0a1.dev105) for download, s3am (2.0a1.dev99) for upload.
After moving bam files from a collaborator's inbox to our encrypted directory and running analysis, I noticed several truncated files. Asked the collaborator to re-upload, then ran s3am upload --exists overwrite to move the files. The sizes appeared to be the same between the two locations until I looked very closely with the aws client:
And indeed, s3am download from the encrypted location results in the same truncated file as before (VERY truncated, actually, only chrs M and 1 have reads). Downloading directly from the inbox gives a complete file. The downloaded file sizes are exactly as listed on S3.
The exact copy command:
s3am upload --exists overwrite --src-sse-key-file ./20160707.key --sse-key-file ./master.key --sse-key-is-master --download-slots 40 --upload-slots 40 --part-size 50M s3://cgl-inbox-su2c-ucsf/DTB-157-BL-T-DNA-HSmerge_S1.bam s3://cgl-driver-projects-encrypted/ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam
Let me know if you want the collaborator key.
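Since the size difference was small enough to miss by eye, here is a minimal shell sketch for comparing the two listings automatically. It assumes the four-column `aws s3 ls --recursive` output (date, time, size, key) seen in this thread and keys without spaces. The helper names `ls_size` and `report_size_diff` are made up for illustration; they are not part of s3am or the aws CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of s3am or the aws CLI.

# Read `aws s3 ls --recursive` output on stdin and print the size (third
# column) of the first line whose key (fourth column) equals $1 exactly.
# Assumes keys contain no whitespace.
ls_size() {
    awk -v key="$1" '$4 == key { print $3; exit }'
}

# Report whether two sizes agree; returns non-zero when they differ.
report_size_diff() {
    local src=$1 dst=$2
    if [ "$src" -eq "$dst" ]; then
        echo "sizes match: $src bytes"
    else
        echo "size mismatch: source=$src destination=$dst diff=$((src - dst))"
        return 1
    fi
}

# Possible usage (bucket and key names taken from the commands in this thread):
#   src=$(aws --region us-west-2 s3 ls --recursive cgl-inbox-su2c-ucsf \
#           | ls_size DTB-157-BL-T-DNA-HSmerge_S1.bam)
#   dst=$(aws --region us-west-2 s3 ls --recursive cgl-driver-projects-encrypted \
#           | ls_size ucsf-pnoc/ucsf_issue52_input/DTB-157-BL-T-DNA-HSmerge_S1.bam)
#   report_size_diff "$src" "$dst"
```

Matching the key exactly, rather than with grep, avoids picking up other objects whose names merely contain the file name. A size match alone does not prove the contents are identical, so this only catches the kind of discrepancy described above.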