Closed StuntsPT closed 5 years ago
Just for the sake of completeness, here are the first few lines of Gal02.clustS.gz and Bot01.clust.gz:
zcat Gal02.clustS.gz | head
b28abbacffa082292509cf3dcfeab9bd;size=13;*
TAAGCCCACTGGGGGAGGGTGTTAATGTAGAAGTGGCTTCTTCTTATGAGGATGTTTTGCAAGAGAGACATTTTTACATGCAGCTGCAGAGTAATATGTAGTTATGCCCTGAGT
//
//
81165659719a875b4bc8511160ad856a;size=13;*
CATACTGGGTCGCTCAATCATTGCTACTGTGTCCTTTATTCTAGCAGATCAAGATGATCAGGTGACAGCAAAAACTTCAAGGTTTTTAGCGGGAACAATTAGAATAAGATTTATTTGATT
//
//
c65b462470c083df934327dcd50f89e3;size=13;*
CACTTCCAGAGAACAGCACTGGCTGGACCCATGGATTTATGTTATGGAGTCCCACAGGGCTCCATCTTATCCCTCATGCTGTTCAATATCTACAGCAGGGGTGGCCAACTCCCA
zcat Bot01.clust.gz | head
>000018efa0c56c15b8211105139de92d;size=4;*
GCAGCCCCAGTTGTACTTTTAGACAAGCCTGATGGCTCTGTCAGATTTTGCATCGATTATAGAAAATTAAACCATGTCACTAAAGCGGACGCCTACCCAATGCCCCGCTTAGATGACCTT
>7302c28a504affebca3f7f9b2f8f54b0;size=2;+
AGTTCCTGGGCAGCCCCAGTTGTACTTTTAGACAAGCCTGATGGCTCTGTCAGATTTTGCATCGATTATAGAAAATTAAACCATGTCACTAAAGCGGACGCCTACCCAATGCCCCGCTTA
>ecd74b24946f0307c4f9b44d8ec96914;size=1;+
GTCCCTGGGCAGCCCCAGTTGTACTTTTAGACAAGCCTGATGGCTGTCAGATTTTTCATTGATTATAGAAAATTAAACCATGTCACTAAAGCGGACGCCTACCCAATGCCCCGCTT
//
//
>0000240af5d10f5fc3ff921e6d940847;size=6;*
CCACAGCCTAGGAATGGGTGGGGTGAGGGCAGGATATCCTAATGATCTTCTACCAATGACTTGGTGAAATAATTGGACAAAAAACCCAGTATGTGAGTTTAAAAATAATTAGCTCAAACC
Now that I am looking more closely, they are actually significantly different...
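One way to make the difference between the two files concrete is to total the read depth of each cluster. The following is only a rough sketch, not part of ipyrad: the function name `cluster_depths` is made up for illustration, and it assumes the layout visible in the excerpts above (header lines carrying `;size=N;`, each followed by a sequence line, with clusters separated by `//` lines).

```python
import gzip
import re

# Matches the ';size=N;' field seen in the cluster headers above.
SIZE_RE = re.compile(r";size=(\d+);")


def cluster_depths(path):
    """Return the summed ';size=' value of each cluster in a gzipped file.

    Hypothetical helper: assumes clusters are separated by '//' lines and
    that only header lines contain a ';size=N;' field.
    """
    depths = []
    current = 0
    seen_header = False
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("//"):
                # A separator closes the current cluster, if there is one.
                if seen_header:
                    depths.append(current)
                current, seen_header = 0, False
                continue
            match = SIZE_RE.search(line)
            if match:
                current += int(match.group(1))
                seen_header = True
    # The last cluster may not be followed by a separator.
    if seen_header:
        depths.append(current)
    return depths
```

Calling `max(cluster_depths("Bot01.clust.gz"))` on the real file would then give the deepest cluster for that sample, which can be compared against a sample that worked.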
Are there any error messages in the ipyrad_log.txt file? What param settings are you using? Is this ls Tlep01_clust_0.85 run after step 3 completed? I'm curious because the htemp/utemp files for the failed sample aren't being cleaned up, which could indicate some problem. If you want to dropbox me the files for the Bot1 sample I can try to take a look at it. Also, if you re-run step 3 and include the -d flag it'll write more info to the log file (if you DO NOT include the -f flag then it'll only try to re-run those samples that previously failed step 3).
Hi @isaacovercast,
I will run with -d and without -f and post the results here.
Which files for Bot1 would you like?
All the files from the _clust directory, the file from the _fastqs directory. And the params file you used would be great.
Will do. I'd normally shrug this off as some problem with those samples, but the non-cleanup of tempfiles has tipped me off that something else might be at play. I'm not sure I'll be able to upload things in time today, so expect the files next Tuesday at the earliest. Sorry about that!
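Hi @isaacovercast, sorry about the long wait. I have been preparing a "minimal" example for easy reproducibility. Here is a dropbox link (https://www.dropbox.com/sh/5x7pkqrahvwmg9q/AAAxl6jDn_fM_Y-98xDJihwUa?dl=0), which contains:
- 2 samples in the original dereplicated fastq.gz format
- The entire project directory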
It was run using the following command:
ipyrad -p params-Bug345.txt -s 1234567 -c 16 -d
(debug mode activated)
Here is the STDOUT:
╭──francisco@Kakarotto [12:44] [~/Data_analyses/Tlep/bug_assembly] {ipyrad}
╰─$ ipyrad -p params-Bug345.txt -s 1234567 -c 16 -d
** Enabling debug mode **
-------------------------------------------------------------
ipyrad [v.0.7.30]
Interactive assembly and analysis of RAD-seq data
-------------------------------------------------------------
New Assembly: Bug345
establishing parallel connection:
host compute node: [16 cores] on Kakarotto
Step 1: Loading sorted fastq data to Samples
[####################] 100% loading reads | 0:00:10
2 fastq files loaded to 2 Samples.
Step 2: Filtering reads
[####################] 100% processing reads | 0:01:35
Step 3: Clustering/Mapping reads
[####################] 100% dereplicating | 0:00:24
[####################] 100% clustering | 1:02:59
[####################] 100% building clusters | 0:02:48
[####################] 100% chunking | 0:00:15
[####################] 100% aligning | 0:00:01
[####################] 100% concatenating | 0:00:08
no clusters found for Ses02
Step 4: Joint estimation of error rate and heterozygosity
skipping Ses02; not clustered yet. Run step3() first.
[####################] 100% inferring [H, E] | 0:00:01
Info: Sample Bot01 - No clusters have sufficient depth for statistical
basecalling. Setting default heterozygosity/error to 0.01/0.001.
Step 5: Consensus base calling
Skipping Sample Ses02; not yet finished step4
Skipping Sample Bot01; No clusters found.
Encountered an error (see details in ./ipyrad_log.txt)
Error summary is below -------------------------------
No samples to cluster, exiting.
The two things I find odd here: I have "grepped" the original fastq, and I have confirmed that some sequences are present more often than 4 times on the "Bot01" sample (where the maximum cluster seems to be 4). That makes the result of "no clusters found" extremely suspicious. Here is an example:
zgrep -c "GGGGAGACAGAGATTACATTGGCATGCAGTCAGCCGAGAAAATGCTCTTCCTTAATCTTAGAATTGTAGAGTTGGAAGGGACCATGAGGATCATCCCGTCCAACCCCCTGCAA" Bot01.fastq.gz
23
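For anyone who wants to repeat this check without `zgrep`, here is a minimal Python sketch. It is an illustration only: the name `count_reads_containing` is hypothetical, and it assumes plain 4-line FASTQ records. Unlike `zgrep -c`, it only looks at the sequence line of each record, so a chance match in a header or quality line is not counted.

```python
import gzip


def count_reads_containing(fastq_path, subsequence):
    """Count FASTQ records whose sequence line contains `subsequence`.

    Hypothetical helper: assumes every record spans exactly four lines
    (header, sequence, '+', quality).
    """
    count = 0
    with gzip.open(fastq_path, "rt") as handle:
        for index, line in enumerate(handle):
            # In a 4-line record, the sequence is the second line.
            if index % 4 == 1 and subsequence in line:
                count += 1
    return count
```

A call such as `count_reads_containing("Bot01.fastq.gz", "GGGGAGACAGAG...")` with the full sequence from the command above should agree with the `zgrep -c` count, as long as the sequence never appears in a non-sequence line.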
I have no idea why there is no Ses02.clustS.gz file. Neither sample got the tempfiles cleared up.
I hope we can get to the bottom of this.
Oh, and BTW, this was all performed in a "clean" conda environment, after first running export PYTHONNOUSERSITE=True to make sure no system package is used.
Also, sorry about the huge size, but I wanted to make sure I didn't miss anything.
Hello Francisco, Thanks for sending this along. I ran the data and it works for me. I just re-ran the whole thing from step 1 with the -f flag and it looks fine. In your output it looks like there was a problem with the alignment step, since your alignment step finished far too quickly. This could be an indication of disk allocation issues (like you are running out of disk space and the alignment processes are dying). Are you sure you have enough disk space?
[image: image.png]
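A quick way to rule out the disk space explanation before re-running is to query the filesystem that holds the project directory. The sketch below uses only the Python standard library. The helper names and the 50 GiB threshold are arbitrary assumptions for illustration, not ipyrad requirements.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_headroom(path=".", required_gib=50.0):
    """Return True if at least `required_gib` GiB are free at `path`.

    `required_gib` is a made-up threshold; pick one that matches the size
    of the temporary files your own assembly produces.
    """
    return free_gib(path) >= required_gib
```

Running `free_gib("/path/to/project_dir")` just before and just after step 3 would show whether free space ever gets close to zero during the run.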
That is weird indeed. The drive the analyses are running on still has plenty of space left though (~150GB left). It's not that much, but really should suffice. But now that you mention space and a step being too fast... This is being run on a machine with an NVMe SSD. I will try to reproduce on a machine with an HDD. I will report back as soon as I am able to run this minimal example on our HPC, which has HDDs.
PS - Your attached image did not display on GitHub
Hm, maybe that's it. If you're running on a very old spinning disk the pipe could overflow and this would cause all kinds of problems. Try it.
Ok, confirming this: on our 2012 HPC, which has an HDD, everything runs as normal! Specs:
Intel Xeon E5-2609
SAS HDDs in RAID5
However, on my workstation, with an NVMe SSD, the error occurs. Specs:
AMD Ryzen 7 2700
Samsung 970EVO (SM981/PM981)
Do you have another machine with an SSD where you can confirm this? I will test on my home box which also has a similar SSD and see if I can reproduce the issue there.
If this were a problem with the low throughput of an HDD, it would be OK in my book, but if it is an issue with the speed of SSDs, maybe this issue is worth pursuing?
Glad to hear it's working. SSD r/w should be faster than HDD across the board, so my suspicion is that it's something other than the drive. I've run ipyrad on HDD and SSD boxes dozens and dozens of times, so I suspect this is some weird edge case in some config aspect of your workstation, in which case it's not really worth troubleshooting, imho, unless you feel really motivated. HW bugs can be a real headache to track down, though.
Ok, I can reproduce this on my home machine too. Same error actually. Specs:
AMD Ryzen 5 2400G
Samsung 970EVO (SM981/PM981)
So this is not exclusive to a single system, but admittedly my home box and my workstation are rather similar. I think this can be closed for now, but I'll reopen it should I find out more.
Just as a reference, I can also reproduce on my laptop. Specs:
Intel i7-4700HQ
Samsung 840EVO
So far, these systems have in common: SSD; Arch Linux, up-to-date as of June 22, 2019
I've been having a very weird issue with my ipyrad analysis: Specifically Step 3. Some of the samples are not finishing the clustering. ipyrad claims Step 3 is finished and moves to Step 4, however in this step, it is stated that some samples have not been clustered.
Under the Tlep01_clust_0.85 directory I find this:
Bot01_trimmed_merged is one of the samples that Step 4 complains does not have any clusters, while Gal02 and Gal03 are samples that "worked". The file Bot01.clust.gz is very similar in structure to Gal02.clustS.gz, however, Bot01.clustS.gz is empty (as you can probably assume by its size).
What do you think can be the cause of this?