Open nextgenusfs opened 7 years ago
I will mark your email as unread. I am away and my internet is extremely limited and intermittent.
On Dec 21, 2016 02:02, "Jon Palmer" notifications@github.com wrote:
On Mac, it seems to have compiled correctly, but I get an error on the test data. I don't think it is a permissions issue.
src/deML -i testData/index.txt -f testData/todemultiplex.fq1.gz -r testData/todemultiplex.fq2.gz -if1 testData/todemultiplex.i1.gz -if2 testData/todemultiplex.i2.gz -o testData/
Conflicts for index1:
AGTCAGA from RG9 causes a conflict with RG57
AACTAGA from RG10 causes a conflict with RG58
CTATGGC from RG11 causes a conflict with RG59
CGACGGT from RG12 causes a conflict with RG60
AACCAAG from RG13 causes a conflict with RG61
CGGCGTA from RG14 causes a conflict with RG62
GCAGTCC from RG15 causes a conflict with RG63
CTCGCGC from RG16 causes a conflict with RG64
CTGCGAC from RG17 causes a conflict with RG65
ACGTATG from RG18 causes a conflict with RG66
ATACTGA from RG19 causes a conflict with RG67
AGTCAGA from RG57 causes a conflict with RG9
AACTAGA from RG58 causes a conflict with RG10
CTATGGC from RG59 causes a conflict with RG11
CGACGGT from RG60 causes a conflict with RG12
AACCAAG from RG61 causes a conflict with RG13
CGGCGTA from RG62 causes a conflict with RG14
GCAGTCC from RG63 causes a conflict with RG15
CTCGCGC from RG64 causes a conflict with RG16
CTGCGAC from RG65 causes a conflict with RG17
ACGTATG from RG66 causes a conflict with RG18
ATACTGA from RG67 causes a conflict with RG19
Conflicts for index2:
AATTCAA from RG1 causes a conflict with RG57
CGCGCAG from RG2 causes a conflict with RG58
AAGGTCT from RG3 causes a conflict with RG59
ACTGGAC from RG4 causes a conflict with RG60
AGCAGGT from RG5 causes a conflict with RG61
GTACCGG from RG6 causes a conflict with RG62
GGTCAAG from RG7 causes a conflict with RG63
AATGATG from RG8 causes a conflict with RG64
AGTCAGA from RG9 causes a conflict with RG65
AACTAGA from RG10 causes a conflict with RG66
CTATGGC from RG11 causes a conflict with RG67
AGTCAGA from RG57 causes a conflict with RG1
AACTAGA from RG58 causes a conflict with RG2
CTATGGC from RG59 causes a conflict with RG3
CGACGGT from RG60 causes a conflict with RG4
AACCAAG from RG61 causes a conflict with RG5
CGGCGTA from RG62 causes a conflict with RG6
GCAGTCC from RG63 causes a conflict with RG7
CTCGCGC from RG64 causes a conflict with RG8
CTGCGAC from RG65 causes a conflict with RG9
ACGTATG from RG66 causes a conflict with RG10
ATACTGA from RG67 causes a conflict with RG11
Conflicts for pairs:
Cannot write to file testData/_RG49_r1.fail.fq.gz either you do not have permissions or you have too many read groups, in that case, convert your input data to a single BAM file and demultiplex it
Can you try modifying the -o option to be: -o testData/outdata
I am getting the same error with the test data. I tried using -o testData/outdata, but it returned the same error as previously. Running the script with bam input seemed to work ok.
I don't know how to read code, but the error message made me think that the program can only handle a certain number of samples when the input is in fastq format. I modified the index.txt file so that it contained a smaller number of read groups (including RG49), and that allowed the process to complete with the fastq input files.
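To illustrate why shrinking the read-group list could make a difference, here is a rough Python sketch of the arithmetic, assuming each read group keeps several output files open at once. The number of files per read group and the overhead for other open descriptors are illustrative guesses, not values taken from the deML source.

```python
def descriptors_needed(n_read_groups, files_per_group=4, overhead=16):
    """Rough count of descriptors needed to keep every output file open.

    files_per_group and overhead are assumptions made for illustration;
    they are not taken from deML.
    """
    return n_read_groups * files_per_group + overhead


def fits_limit(n_read_groups, limit, files_per_group=4, overhead=16):
    """Return True if the estimated descriptor count stays within `limit`."""
    return descriptors_needed(n_read_groups, files_per_group, overhead) <= limit


if __name__ == "__main__":
    # Compare a few read-group counts against two open-file limits.
    for groups in (32, 96, 287):
        for limit in (256, 1024):
            verdict = "fits" if fits_limit(groups, limit) else "exceeds"
            print(f"{groups} read groups, limit {limit}: {verdict}")
```

Under these assumed numbers a low open-file limit is exhausted by a few dozen read groups, which would be consistent with a shorter index.txt completing.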
Is there any way that deML can be made to work in fastq mode with more read groups? I have a feeling that converting my fastq amplicon sequencing files into bam format might be a bit tricky/impractical.
Thanks :)
dear klr123, sorry for the late reply. I have added an error message in the code about the number of opened file descriptors and the maximum number of file descriptors on the system. When that limit is reached, it warns the user and prints info on how to raise it.
Could you do a "ulimit -n"? It should be at least 1024.
Thank you for helping me to troubleshoot that Grenaud!
I ran ulimit -n and it returned 256. I then ran ulimit -n 1024, and after that the test data analysis ran to completion without any error messages.
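For anyone launching deML from a Python wrapper rather than an interactive shell, the same adjustment can probably be made with the standard `resource` module (Unix only). This is a minimal sketch; the function name `raise_nofile_soft_limit` is made up for this example.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Try to raise this process's soft open-files limit to `wanted`.

    Returns the soft limit in effect afterwards. The limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough
    # An unprivileged process cannot push the soft limit past the hard limit.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target <= soft:
        return soft
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some systems reject values above a kernel-imposed cap.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft open-files limit is now", raise_nofile_soft_limit(1024))
```

Child processes inherit these limits, so a deML run started with `subprocess` after this call should see the raised value, much like running `ulimit -n 1024` in the shell first.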
Thanks for your help :)
I'm getting the exact same issue, but ulimit is 65536. Any idea what might be causing this?
really? How many read groups do you have? What is your OS?
Hey! Thanks for answering so quickly. I'm on a Debian Red Hat system. I have 287 read groups.
Ah, a further check reveals that the software I'm using was last compiled on 2016-05-25. I will reclone the repo and see if this fixes anything
OK, I've recompiled, and the error is still the same.
strange, can you do a ulimit -a?
@dangeles Did you get a chance to test this? I suspect there is a discrepancy on your system between the user limit and the system limit.
Hey, sorry, I was traveling with limited access to internet. Here's my output from ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 128520
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
two follow-up questions:
1) You have debian or red hat?
2) can you give me the output of:
ulimit -Sn
and ulimit -Hn
In the meantime, could you transform your fastqs to bam using https://github.com/grenaud/BCL2BAM2FASTQ and demultiplex on those?
ulimit -Sn: 65536
ulimit -Hn: 65536
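The soft, hard, and system-wide limits asked about above can be collected in one place with a short Python sketch. It assumes a Unix system; the system-wide value is read from `/proc/sys/fs/file-max`, which exists only on Linux, and the helper names are made up for this example.

```python
import os
import resource


def describe(value):
    """Render a limit the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def open_file_limits(file_max_path="/proc/sys/fs/file-max"):
    """Return the per-process soft and hard limits plus the system-wide cap.

    The "system" entry is None where the Linux-specific file is absent.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    system_wide = None
    if os.path.exists(file_max_path):
        with open(file_max_path) as handle:
            system_wide = int(handle.read().strip())
    return {"soft": soft, "hard": hard, "system": system_wide}


if __name__ == "__main__":
    limits = open_file_limits()
    print("soft (ulimit -Sn):", describe(limits["soft"]))
    print("hard (ulimit -Hn):", describe(limits["hard"]))
    print("system-wide      :", limits["system"])
```

If all three values are far above the number of output files, as they appear to be here, the descriptor limit is probably not the whole explanation.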
can you privately send me a drop box link to your index list and files?
Also, if you want to get ahead, I would simply use my fastq2bam (https://github.com/grenaud/BCL2BAM2FASTQ/tree/master/fastq2bam), demultiplex the bam file and reconvert to fastq (https://github.com/grenaud/BCL2BAM2FASTQ/tree/master/bam2fastq)
I'll figure out how to dropbox you some data by Monday.
I used your fastq2bam tool to convert to BAM, seemingly without issues. Oddly enough, though, when I run the demultiplex command on the BAM file, it still gives me the same conflict message. Now, the BAM files are getting populated, so I know the script is running, which makes me think that the problem (though similar to the original) is in my index file.
Here's the exact error message:
Conflicts for index1:
TCGCCTTA from N701S501 causes a conflict with N701S502 N701S503 N701S504 N701S505 N701S506 N701S507 N701S508 N701S509 N701S510 N701S511 N701S512 N701S513 N701S514 N701S515 N701S516 N701S517 N701S518 N701S519 N701S520 N701S521 N701S522 N701S523 N701S524 N701S525 N701S526 N701S527 N701S528 N701S529 N701S530 N701S531 N701S532 N701S533 N701S534 N701S535 N701S536
...{many many lines later}...
Conflicts for index2:
AGTTAACA from N724S536 causes a conflict with N701S536 N702S536 N703S536 N704S536 N705S536 N706S536 N707S536 N708S536 N709S536 N710S536 N711S536 N712S536 N713S536 N714S536 N715S536 N716S536 N717S536 N718S536 N719S536 N720S536 N721S536 N722S536 N723S536
Sorry for all the hassle!
Those are not error messages but simply warnings. It means that TCGCCTTA was used by several read groups. But I guess that is bound to happen if you have more than 200 read groups.
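To see which index sequences are shared before running deML, something like the following Python sketch could be used. It assumes the index file has whitespace-separated columns (index1, index2, read-group name) with `#` comment lines, and it only reports exact duplicates; deML's own conflict check may be stricter. Apart from TCGCCTTA, the sequences in the example are made up.

```python
from collections import defaultdict


def parse_index_lines(lines):
    """Split non-empty, non-comment lines into [index1, index2, name] rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows


def shared_indices(rows, column):
    """Map each sequence in `column` to the read groups using it,
    keeping only sequences used by more than one read group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[-1])
    return {seq: names for seq, names in groups.items() if len(names) > 1}


def shared_pairs(rows):
    """Same check, applied to the (index1, index2) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row[-1])
    return {pair: names for pair, names in groups.items() if len(names) > 1}


# Illustrative rows only; the index2 values and CTAGTACG are invented.
EXAMPLE = """\
#Index1 Index2 Name
TCGCCTTA ACGTACGT N701S501
TCGCCTTA TTGCAATC N701S502
CTAGTACG ACGTACGT N702S501
"""

if __name__ == "__main__":
    rows = parse_index_lines(EXAMPLE.splitlines())
    print("index1:", shared_indices(rows, 0))
    print("index2:", shared_indices(rows, 1))
    print("pairs :", shared_pairs(rows))
```

With combinatorial dual indexing every single index is expected to be shared, so the pair check is the one that should come back empty.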
Let me know for the fastqs, I find it odd that with such a high limit, you get this error message.
Ah, that's good. deML seems to be running just fine on BAM. Seems like for large file numbers, maybe that will just be my default.