Closed mbacino closed 2 weeks ago
Hi @mbacino,
Do you see any error messages?
Can you basecall any of this data locally?
Can you explain what you mean by "could compressing the pod5 files be the issue?"?
Rich
I didn’t receive any error messages because the dorado job is stuck in the HPC job queue. My theory is that the pod5 files aren’t formatted correctly, so dorado will not run. When I uploaded the pod5 files from my hard drive to the HPC, I zipped the folder because each file is over 1 GB and would take a long time to upload. Could compressing pod5 files cause them to become corrupted? I can’t run dorado locally because my computer isn’t powerful enough. Thanks, Margot
Assuming you unzipped the file on the other side that should be fine. Dorado cannot basecall zipped pod5s though.
Also - I'm surprised you gained much compression zipping pod5s. They're already efficiently compressed and I wouldn't have expected it to make much difference. In a local test I got 208M -> 207M.
Are you sure that your data has transferred correctly?
I can't really help you with your stuck job on UGE without more information. Contact your HPC admin to help recover some logs or information that would be helpful.
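One way to answer the "has it transferred correctly?" question is to compare checksums computed on the source machine and on the HPC. The following is a minimal sketch, not part of Dorado or the pod5 tooling; the function names (`sha256_of`, `build_manifest`, `compare_manifests`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB pod5 files never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path, pattern: str = "*.pod5") -> dict[str, str]:
    """Map each matching file name in `directory` to its SHA-256 checksum."""
    return {p.name: sha256_of(p) for p in sorted(directory.glob(pattern))}


def compare_manifests(source: dict[str, str], destination: dict[str, str]) -> list[str]:
    """List files that are missing or differ at the destination (hypothetical helper)."""
    problems = []
    for name, checksum in source.items():
        if name not in destination:
            problems.append(f"{name}: missing after transfer")
        elif destination[name] != checksum:
            problems.append(f"{name}: checksum mismatch")
    return problems
```

Run `build_manifest` once on the local pod5 folder and once on the uploaded folder, then pass both results to `compare_manifests`. An empty list suggests the bytes arrived intact; any entry points at a file worth re-uploading.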
One of my pod5 files was corrupted and there was an error in my job submission script. Re-uploading the pod5 files and editing my script resolved the issue.
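For anyone who uploads a zipped pod5 folder, the archive itself can be checked on the HPC before unzipping. This is a rough sketch using only Python's standard `zipfile` module; `check_zip` is a hypothetical helper name, and the check only covers the zip's own CRC-32 records, so it would not catch a file that was already damaged before zipping.

```python
import zipfile
from pathlib import Path


def check_zip(archive: Path) -> list[str]:
    """Return a list of problems found in the archive (empty if none detected)."""
    if not zipfile.is_zipfile(archive):
        return [f"{archive.name}: not a readable zip archive"]
    with zipfile.ZipFile(archive) as zf:
        # testzip() reads every member and verifies its stored CRC-32;
        # it returns the name of the first bad member, or None.
        bad_member = zf.testzip()
    if bad_member is not None:
        return [f"{bad_member}: CRC check failed"]
    return []
```

If `check_zip` reports a problem, re-uploading the archive is likely quicker than debugging a basecalling job that fails on a damaged input.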
Dorado basecaller dna_r10.4.1_e8.2_400bps_sup@v4.3.0 is stuck in the HPC job queue. I uploaded pod5 files from my MinKNOW run to my university's HPC by zipping the pod5 folder and then unzipping it once it was in the correct directory. I am using a script that has previously worked with a different directory of pod5 files, so I assume the issue is the input pod5 files. Could compressing the pod5 files be the issue? This is the script I am running 24_06_06
Pod5 file directory
total 48G
-rwx------. 1 ms-bacino lynch 1.3G Jun 3 13:57 FAZ22417_f962e43c_8ea0ac62_0.pod5
-rwx------. 1 ms-bacino lynch 2.9G Jun 3 13:57 FAZ22417_f962e43c_8ea0ac62_10.pod5
-rwx------. 1 ms-bacino lynch 2.5G Jun 3 13:57 FAZ22417_f962e43c_8ea0ac62_11.pod5
-rwx------. 1 ms-bacino lynch 2.9G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_12.pod5
-rwx------. 1 ms-bacino lynch 2.3G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_13.pod5
-rwx------. 1 ms-bacino lynch 2.8G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_14.pod5
-rwx------. 1 ms-bacino lynch 2.7G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_15.pod5
-rwx------. 1 ms-bacino lynch 2.1G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_16.pod5
-rwx------. 1 ms-bacino lynch 2.4G Jun 3 13:58 FAZ22417_f962e43c_8ea0ac62_17.pod5
-rwx------. 1 ms-bacino lynch 2.4G Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_18.pod5
-rwx------. 1 ms-bacino lynch 2.0G Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_19.pod5
-rwx------. 1 ms-bacino lynch 2.9G Jun 3 13:57 FAZ22417_f962e43c_8ea0ac62_1.pod5
-rwx------. 1 ms-bacino lynch 795M Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_20.pod5
-rwx------. 1 ms-bacino lynch 2.5G Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_2.pod5
-rwx------. 1 ms-bacino lynch 3.2G Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_3.pod5
-rwx------. 1 ms-bacino lynch 3.3G Jun 3 13:59 FAZ22417_f962e43c_8ea0ac62_4.pod5
-rwx------. 1 ms-bacino lynch 2.6G Jun 5 09:01 FAZ22417_f962e43c_8ea0ac62_5.pod5
-rwx------. 1 ms-bacino lynch 3.1G Jun 5 09:21 FAZ22417_f962e43c_8ea0ac62_6.pod5
-rwx------. 1 ms-bacino lynch 3.0G Jun 5 09:41 FAZ22417_f962e43c_8ea0ac62_7.pod5
-rwx------. 1 ms-bacino lynch 2.6G Jun 5 10:01 FAZ22417_f962e43c_8ea0ac62_8.pod5