aquaskyline / LRSIM

10x Genomics Reads Simulator
MIT License

Ran out of barcodes #37

Open pdimens opened 1 year ago

pdimens commented 1 year ago

I'm trying to simulate reads to test the efficacy of linked reads on low-coverage datasets. The testing will cover a range of low coverages and sample counts. However, when I try to use LRSIM, I keep getting a message that I've run out of barcodes. I'm using a truncated D. melanogaster genome that's just the 4 largest chromosomes (for simplicity).

invmin=1000
invmax=10000
mollen=80
milreads=1
molper=10
prefix="sims.${milreads}mil.${molper}per"
threads=8

./LRSIM/simulateLinkedReads.pl -r dmel.trunc.fa -p $prefix -0 0 -x $milreads -f $mollen -m $molper -z $threads -o

How can I successfully produce low-coverage data without the warning that I've run out of barcodes? There is also an error of being unable to concatenate a file. LRSIM output:

Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.0.1.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq
cat: sims.1mil.10per.dwgsim.0.1.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.0.2.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq
cat: sims.1mil.10per.dwgsim.0.2.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.0.3.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq
cat: sims.1mil.10per.dwgsim.0.3.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.1.1.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq
cat: sims.1mil.10per.dwgsim.1.1.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.1.2.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq
cat: sims.1mil.10per.dwgsim.1.2.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: cat sims.1mil.10per.dwgsim.1.3.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq
cat: sims.1mil.10per.dwgsim.1.3.12.fastq: No such file or directory
Wed Mar 22 12:07:10 2023: Simulate reads start
Wed Mar 22 12:07:10 2023: Load barcodes start
Wed Mar 22 12:07:10 2023: Load barcodes end
Wed Mar 22 12:07:10 2023: readPairsPerMolecule: 0
Wed Mar 22 12:07:10 2023: Simulating on haplotype: 0
Wed Mar 22 12:07:10 2023: Load read positions haplotype 0
Wed Mar 22 12:07:11 2023: Importing sims.1mil.10per.0.fp
Wed Mar 22 12:07:12 2023: Imported sims.1mil.10per.0.fp
Wed Mar 22 12:07:12 2023: readsCountDown: 500000
Wed Mar 22 12:08:35 2023: Reached end of barcodes list. No more barcodes. Last read processed: 500000. Exiting.
Inappropriate ioctl for device at ./LRSIM/simulateLinkedReads.pl line 748.
aquaskyline commented 1 year ago

Are there any error messages earlier than cat: sims.1mil.10per.dwgsim.0.1.12.fastq: No such file or directory? dwgsim doesn't seem to have run correctly.
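One way to confirm whether dwgsim produced its per-thread output before the concatenation step is to check for the files named in the `cat` lines. The sketch below is only an illustration: the file name pattern is copied from the log in this thread, while the function names and the defaults of 2 haplotypes and 4 threads per haplotype are assumptions, not part of LRSIM.

```python
from pathlib import Path


def expected_chunks(prefix, haplotypes=2, threads_per_haplotype=4):
    """Names of the per-thread dwgsim FASTQ chunks that the log tries to `cat`."""
    return [
        f"{prefix}.dwgsim.{hap}.{thread}.12.fastq"
        for hap in range(haplotypes)
        # The log only concatenates threads 1..N-1 into <prefix>.dwgsim.<hap>.12.fastq.
        for thread in range(1, threads_per_haplotype)
    ]


def missing_chunks(prefix, directory=".", **kwargs):
    """Return the expected chunk names that are absent from `directory`."""
    base = Path(directory)
    return [
        name
        for name in expected_chunks(prefix, **kwargs)
        if not (base / name).exists()
    ]


if __name__ == "__main__":
    # Prefix taken from the command in this thread.
    for name in missing_chunks("sims.1mil.10per"):
        print("missing:", name)
```

If this prints any file names right after the DWGSIM rounds finish, the problem is upstream of the barcode step.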

pdimens commented 1 year ago
Thu Mar 23 09:40:52 2023: sims.1mil.10per.status
Thu Mar 23 09:40:52 2023: Variant simulation mode enabled
Thu Mar 23 09:40:52 2023: SURVIVOR start
Thu Mar 23 09:40:52 2023: Running: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/SURVIVOR 0 dmel.trunc.fa sims.1mil.10per.hap.parameter 0 sims.1mil.10per.hap 1000
Thu Mar 23 09:44:09 2023: SURVIVOR end
Thu Mar 23 09:44:09 2023: Build genome index start
Thu Mar 23 09:44:09 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/faFilter.pl sims.1mil.10per.hap.0.fasta 0 > sims.1mil.10per.hap.0.clean.fasta
Thu Mar 23 09:44:11 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/faFilter.pl sims.1mil.10per.hap.1.fasta 0 > sims.1mil.10per.hap.1.clean.fasta
Thu Mar 23 09:44:14 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/samtools faidx sims.1mil.10per.hap.0.clean.fasta
Thu Mar 23 09:44:17 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/samtools faidx sims.1mil.10per.hap.1.clean.fasta
Thu Mar 23 09:44:21 2023: Build genome index end
Thu Mar 23 09:44:21 2023: DWGSIM round 0 thread 0 start
Thu Mar 23 09:44:21 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.0.clean.fasta sims.1mil.10per.dwgsim.0.0
[dwgsim_core] 2L length: 23751915
[dwgsim_core] 2R length: 25606425
[dwgsim_core] 3L length: 28365492
[dwgsim_core] 3R length: 32273355
[dwgsim_core] 4 length: 1430313
[dwgsim_core] X length: 23784311
[dwgsim_core] Y length: 3822390
[dwgsim_core] 7 sequences, total length: 139034201
[dwgsim_core] Currently on: 
0
[dwgsim_core] 0
[dwgsim_core] 10000
[dwgsim_core] 20000
[dwgsim_core] 30000
[dwgsim_core] 32032
[dwgsim_core] 40000
[dwgsim_core] 50000
[dwgsim_core] 60000
[dwgsim_core] 66565
[dwgsim_core] 70000
Thu Mar 23 09:44:24 2023: DWGSIM round 0 thread 1 start
Thu Mar 23 09:44:24 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.0.clean.fasta sims.1mil.10per.dwgsim.0.1
[dwgsim_core] 2L length: 23751915
[dwgsim_core] 2R length: 25606425
[dwgsim_core] 3L length: 28365492
[dwgsim_core] 3R length: 32273355
[dwgsim_core] 4 length: 1430313
[dwgsim_core] X length: 23784311
[dwgsim_core] Y length: 3822390
[dwgsim_core] 7 sequences, total length: 139034201
[dwgsim_core] Currently on: 
0
[dwgsim_core] 80000
[dwgsim_core] 90000
[dwgsim_core] 100000
[dwgsim_core] 0
[dwgsim_core] 104818
[dwgsim_core] 10000
[dwgsim_core] 20000
[dwgsim_core] 30000
[dwgsim_core] 32032
[dwgsim_core] 110000
[dwgsim_core] 120000
[dwgsim_core] 130000
[dwgsim_core] 40000
[dwgsim_core] 140000
Thu Mar 23 09:44:26 2023: DWGSIM round 0 thread 2 start
Thu Mar 23 09:44:26 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.0.clean.fasta sims.1mil.10per.dwgsim.0.2
[dwgsim_core] 2L length: 23751915
[dwgsim_core] 2R length: 25606425
[dwgsim_core] 3L length: 28365492
[dwgsim_core] 3R length: 32273355
[dwgsim_core] 4 length: 1430313
[dwgsim_core] X length: 23784311
[dwgsim_core] Y length: 3822390
[dwgsim_core] 7 sequences, total length: 139034201
[dwgsim_core] Currently on: 
0
[dwgsim_core] 50000
[dwgsim_core] 148341
[dwgsim_core] 150000
[dwgsim_core] 150270
[dwgsim_core] 60000
[dwgsim_core] 66565
[dwgsim_core] 0
[dwgsim_core] 10000
[dwgsim_core] 160000
[dwgsim_core] 20000
[dwgsim_core] 170000
[dwgsim_core] 70000
[dwgsim_core] 30000
[dwgsim_core] 32032
[dwgsim_core] 180000
[dwgsim_core] 80000
[dwgsim_core] 182345
[dwgsim_core] 90000
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:27 2023: DWGSIM round 0 thread 0 end

[dwgsim_core] 100000
[dwgsim_core] 104818
[dwgsim_core] 40000
Thu Mar 23 09:44:28 2023: DWGSIM round 0 thread 3 start
Thu Mar 23 09:44:28 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.0.clean.fasta sims.1mil.10per.dwgsim.0.3
[dwgsim_core] 2L length: 23751915
[dwgsim_core] 2R length: 25606425
[dwgsim_core] 3L length: 28365492
[dwgsim_core] 3R length: 32273355
[dwgsim_core] 4 length: 1430313
[dwgsim_core] X length: 23784311
[dwgsim_core] Y length: 3822390
[dwgsim_core] 7 sequences, total length: 139034201
[dwgsim_core] Currently on: 
0
[dwgsim_core] 50000
[dwgsim_core] 60000
[dwgsim_core] 66565
[dwgsim_core] 110000
[dwgsim_core] 0
[dwgsim_core] 120000
[dwgsim_core] 10000
[dwgsim_core] 130000
[dwgsim_core] 20000
[dwgsim_core] 140000
[dwgsim_core] 70000
[dwgsim_core] 30000
[dwgsim_core] 32032
[dwgsim_core] 148341
[dwgsim_core] 80000
[dwgsim_core] 150000
[dwgsim_core] 150270
[dwgsim_core] 90000
[dwgsim_core] 100000
[dwgsim_core] 104818
[dwgsim_core] 40000
[dwgsim_core] 160000
Thu Mar 23 09:44:30 2023: DWGSIM round 1 thread 0 start
Thu Mar 23 09:44:30 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.1.clean.fasta sims.1mil.10per.dwgsim.1.0
[dwgsim_core] 2L length: 23831808
[dwgsim_core] 2R length: 25622613
[dwgsim_core] 3L length: 28305981
[dwgsim_core] 3R length: 32300288
[dwgsim_core] 4 length: 1452932
[dwgsim_core] X length: 23686862
[dwgsim_core] Y length: 3828009
[dwgsim_core] 7 sequences, total length: 139028493
[dwgsim_core] Currently on: 
0
[dwgsim_core] 50000
[dwgsim_core] 170000
[dwgsim_core] 60000
[dwgsim_core] 180000
[dwgsim_core] 182345
[dwgsim_core] 66565
[dwgsim_core] 110000
[dwgsim_core] 0
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:30 2023: DWGSIM round 0 thread 1 end

[dwgsim_core] 120000
[dwgsim_core] 10000
[dwgsim_core] 130000
[dwgsim_core] 20000
[dwgsim_core] 140000
[dwgsim_core] 70000
[dwgsim_core] 30000
[dwgsim_core] 32141
[dwgsim_core] 148341
[dwgsim_core] 150000
[dwgsim_core] 80000
[dwgsim_core] 150270
[dwgsim_core] 90000
[dwgsim_core] 100000
[dwgsim_core] 104818
[dwgsim_core] 40000
[dwgsim_core] 160000
[dwgsim_core] 50000
[dwgsim_core] 170000
[dwgsim_core] 60000
[dwgsim_core] 180000
[dwgsim_core] 66697
[dwgsim_core] 182345
[dwgsim_core] 110000
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:32 2023: DWGSIM round 0 thread 2 end

[dwgsim_core] 120000
[dwgsim_core] 130000
Thu Mar 23 09:44:33 2023: DWGSIM round 1 thread 1 start
Thu Mar 23 09:44:33 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.1.clean.fasta sims.1mil.10per.dwgsim.1.1
[dwgsim_core] 2L length: 23831808
[dwgsim_core] 2R length: 25622613
[dwgsim_core] 3L length: 28305981
[dwgsim_core] 3R length: 32300288
[dwgsim_core] 4 length: 1452932
[dwgsim_core] X length: 23686862
[dwgsim_core] Y length: 3828009
[dwgsim_core] 7 sequences, total length: 139028493
[dwgsim_core] Currently on: 
0
[dwgsim_core] 70000
[dwgsim_core] 140000
[dwgsim_core] 80000
[dwgsim_core] 148341
[dwgsim_core] 150000
[dwgsim_core] 150270
[dwgsim_core] 90000
[dwgsim_core] 0
[dwgsim_core] 100000
[dwgsim_core] 10000
[dwgsim_core] 104872
[dwgsim_core] 20000
[dwgsim_core] 160000
[dwgsim_core] 30000
[dwgsim_core] 32141
[dwgsim_core] 170000
[dwgsim_core] 180000
[dwgsim_core] 182345
[dwgsim_core] 110000
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:34 2023: DWGSIM round 0 thread 3 end

[dwgsim_core] 120000
[dwgsim_core] 40000
[dwgsim_core] 130000
[dwgsim_core] 50000
Thu Mar 23 09:44:35 2023: DWGSIM round 1 thread 2 start
Thu Mar 23 09:44:35 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.1.clean.fasta sims.1mil.10per.dwgsim.1.2
[dwgsim_core] 2L length: 23831808
[dwgsim_core] 2R length: 25622613
[dwgsim_core] 3L length: 28305981
[dwgsim_core] 3R length: 32300288
[dwgsim_core] 4 length: 1452932
[dwgsim_core] X length: 23686862
[dwgsim_core] Y length: 3828009
[dwgsim_core] 7 sequences, total length: 139028493
[dwgsim_core] Currently on: 
0
[dwgsim_core] 140000
[dwgsim_core] 60000
[dwgsim_core] 148434
[dwgsim_core] 150000
[dwgsim_core] 150393
[dwgsim_core] 66697
[dwgsim_core] 0
[dwgsim_core] 10000
[dwgsim_core] 20000
[dwgsim_core] 70000
[dwgsim_core] 160000
[dwgsim_core] 30000
[dwgsim_core] 32141
[dwgsim_core] 80000
[dwgsim_core] 170000
[dwgsim_core] 90000
[dwgsim_core] 180000
[dwgsim_core] 182338
[dwgsim_core] 100000
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:36 2023: DWGSIM round 1 thread 0 end

[dwgsim_core] 104872
[dwgsim_core] 40000
[dwgsim_core] 50000
Thu Mar 23 09:44:37 2023: DWGSIM round 1 thread 3 start
Thu Mar 23 09:44:37 2023: /local/workdir/pdimens/dmelanogaster/sims/LRSIM/dwgsim -N 187500 -e 0.0001,0.0016 -E 0.0001,0.0016 -d 350 -s 35 -1 135 -2 151 -H -y 0 -S 0 -c 0 -m /dev/null sims.1mil.10per.hap.1.clean.fasta sims.1mil.10per.dwgsim.1.3
[dwgsim_core] 2L length: 23831808
[dwgsim_core] 2R length: 25622613
[dwgsim_core] 3L length: 28305981
[dwgsim_core] 3R length: 32300288
[dwgsim_core] 4 length: 1452932
[dwgsim_core] X length: 23686862
[dwgsim_core] Y length: 3828009
[dwgsim_core] 7 sequences, total length: 139028493
[dwgsim_core] Currently on: 
0
[dwgsim_core] 60000
[dwgsim_core] 66697
[dwgsim_core] 110000
[dwgsim_core] 0
[dwgsim_core] 120000
[dwgsim_core] 10000
[dwgsim_core] 130000
[dwgsim_core] 20000
[dwgsim_core] 140000
[dwgsim_core] 70000
[dwgsim_core] 148434
[dwgsim_core] 30000
[dwgsim_core] 32141
[dwgsim_core] 150000
[dwgsim_core] 150393
[dwgsim_core] 80000
[dwgsim_core] 90000
[dwgsim_core] 100000
[dwgsim_core] 104872
[dwgsim_core] 160000
[dwgsim_core] 40000
[dwgsim_core] 170000
[dwgsim_core] 50000
[dwgsim_core] 180000
[dwgsim_core] 60000
[dwgsim_core] 182338
[dwgsim_core] 66697
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:39 2023: DWGSIM round 1 thread 1 end

[dwgsim_core] 110000
[dwgsim_core] 120000
[dwgsim_core] 130000
[dwgsim_core] 140000
[dwgsim_core] 70000
[dwgsim_core] 148434
[dwgsim_core] 150000
[dwgsim_core] 150393
[dwgsim_core] 80000
[dwgsim_core] 90000
[dwgsim_core] 100000
[dwgsim_core] 104872
[dwgsim_core] 160000
[dwgsim_core] 170000
[dwgsim_core] 180000
Thu Mar 23 09:44:41 2023: cat sims.1mil.10per.dwgsim.0.1.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq

[dwgsim_core] 182338
Thu Mar 23 09:44:41 2023: cat sims.1mil.10per.dwgsim.0.2.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq
Thu Mar 23 09:44:41 2023: cat sims.1mil.10per.dwgsim.0.3.12.fastq >> sims.1mil.10per.dwgsim.0.12.fastq

[dwgsim_core] 110000
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:41 2023: DWGSIM round 1 thread 2 end
Thu Mar 23 09:44:41 2023: cat sims.1mil.10per.dwgsim.1.1.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq

[dwgsim_core] 120000
Thu Mar 23 09:44:41 2023: cat sims.1mil.10per.dwgsim.1.2.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq

[dwgsim_core] 130000
[dwgsim_core] 140000
[dwgsim_core] 148434
[dwgsim_core] 150000
[dwgsim_core] 150393
[dwgsim_core] 160000
[dwgsim_core] 170000
[dwgsim_core] 180000
[dwgsim_core] 182338
[dwgsim_core] 187500
[dwgsim_core] Complete!
Thu Mar 23 09:44:43 2023: DWGSIM round 1 thread 3 end
Thu Mar 23 09:44:43 2023: cat sims.1mil.10per.dwgsim.1.3.12.fastq >> sims.1mil.10per.dwgsim.1.12.fastq
Thu Mar 23 09:44:43 2023: Simulate reads start
Thu Mar 23 09:44:43 2023: Load barcodes start
Thu Mar 23 09:44:43 2023: Load barcodes end
Thu Mar 23 09:44:43 2023: readPairsPerMolecule: 0
Thu Mar 23 09:44:43 2023: Simulating on haplotype: 0
Thu Mar 23 09:44:43 2023: Load read positions haplotype 0
Thu Mar 23 09:44:46 2023: 0 reads failed being loaded.
Thu Mar 23 09:44:46 2023: Exporting sims.1mil.10per.0.fp
Thu Mar 23 09:44:47 2023: Exported sims.1mil.10per.0.fp
Thu Mar 23 09:44:47 2023: readsCountDown: 500000
Thu Mar 23 09:46:04 2023: Reached end of barcodes list. No more barcodes. Last read processed: 500000. Exiting.
Inappropriate ioctl for device at ./LRSIM/simulateLinkedReads.pl line 748.
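
The `readPairsPerMolecule: 0` line in the log above may be worth a closer look. The sketch below works through one possible reading of it. It assumes, without having checked the LRSIM source, that this value is roughly the total read pairs divided by the total number of molecules (partitions times molecules per partition), truncated to an integer. The partition count of 1500 thousand is also an assumption, since the command in this thread does not set it; the script's usage message should show the real default. The function name is hypothetical.

```python
def read_pairs_per_molecule(million_read_pairs, thousand_partitions, molecules_per_partition):
    """Integer read pairs available per molecule (assumed formula, not LRSIM's code)."""
    total_read_pairs = million_read_pairs * 1_000_000
    total_molecules = thousand_partitions * 1_000 * molecules_per_partition
    return total_read_pairs // total_molecules


# Values from this thread: 1 million read pairs, 10 molecules per partition.
# 1500 thousand partitions is an ASSUMED value, not taken from the command.
print(read_pairs_per_molecule(1, 1500, 10))    # 1,000,000 // 15,000,000 -> 0

# For comparison, 600 million read pairs with the same assumed partitions.
print(read_pairs_per_molecule(600, 1500, 10))  # 600,000,000 // 15,000,000 -> 40
```

If that reading is right, a very low read count spread over many molecules rounds down to zero read pairs per molecule, which could be related to the barcode list being exhausted. Under the same assumption, reducing the number of partitions so the quotient is at least 1 would be the first thing to try.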
pdimens commented 9 months ago

@aquaskyline does LRSIM recycle barcodes? If it doesn't, then that would explain a lot. Actual linked-read experiments have recurring barcodes that need to be identified as belonging to disparate origin molecules as part of a workflow. It would make sense if LRSIM mimicked this behavior as well, be it by default or opt-in.
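
To make the request concrete, here is a minimal sketch of what opt-in barcode recycling could look like. It is not LRSIM's code: the function name, the `recycle` flag, and the toy barcodes are all hypothetical, and it assumes one barcode is drawn per partition.

```python
from itertools import cycle


def assign_barcodes(n_partitions, barcodes, recycle=False):
    """Assign one barcode per partition, optionally reusing the list when it runs out."""
    if not barcodes:
        raise ValueError("barcode list is empty")
    if recycle:
        # Wrap around to the start of the list once every barcode has been used.
        source = cycle(barcodes)
        return [next(source) for _ in range(n_partitions)]
    if n_partitions > len(barcodes):
        # Mirrors the failure reported in this issue.
        raise RuntimeError("Reached end of barcodes list")
    return list(barcodes[:n_partitions])


if __name__ == "__main__":
    toy_barcodes = ["AAAC", "CCCG", "GGGT"]  # made-up barcodes for illustration
    print(assign_barcodes(5, toy_barcodes, recycle=True))
```

With recycling enabled, downstream tools would see the same barcode on reads from unrelated molecules, which is the situation real linked-read workflows already have to resolve.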