egaffo / circompara2

Improved bioinformatic pipeline to identify and quantify circRNA expression from RNA-seq data by combining multiple circRNA detection methods

TypeError: cannot concatenate 'str' and 'int' objects: #11

Open pecoraro90 opened 1 year ago

pecoraro90 commented 1 year ago

Hi egaffo, thanks for the prompt reply. I tried the new Docker image but I keep getting a similar error:

user@NGS:~/CirComPara$ sudo docker run -u `id -u` --rm -it -v $(pwd):/data egaffo/circompara2:v0.1.2.1
scons: Reading SConscript files ...
TypeError: cannot concatenate 'str' and 'int' objects:
  File "/circompara2/src/sconstructs/main.py", line 489:
    exports = '''env_check_indexes''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/check_indexes.py", line 118:
    exports = '''env_build_indexes ''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/build_indexes.py", line 67:
    exports = '''env_index_hisat2 ''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/index_hisat2.py", line 44:
    ''' ${TARGETS[0].dir}''' + os.path.sep + target_basename + ''' ''' + EXTRA_PARAMS

egaffo commented 1 year ago

Please paste your vars.py file. I suspect you mistyped some variables, using numbers instead of string literals. Perhaps the GENOME_INDEX variable (i.e. the genome index for HISAT2).

pecoraro90 commented 1 year ago

Hi egaffo, this is my vars.py file:

META = 'meta.csv'
GENOME_FASTA = '/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa'
ANNOTATION = '/data/Homo_sapiens.GRCh38.104.gtf'
CPUS = 8

pecoraro90 commented 1 year ago

I also tried

META = 'data/meta.csv'
GENOME_FASTA = '/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa'
ANNOTATION = '/data/Homo_sapiens.GRCh38.104.gtf'
CPUS = 8

but I still get the same error.

egaffo commented 1 year ago

Use CPUS = '8' instead of CPUS = 8.
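To illustrate why the quotes matter, here is a minimal Python sketch of the failure mode. It assumes the value is joined into a command string with `+`, as the last frame of the traceback suggests. The `build_command` and `try_build` helpers are hypothetical and are not CirComPara2's actual code. Note that the exact wording of the TypeError message differs between Python versions.

```python
# Hypothetical helpers for illustration only; not taken from CirComPara2.
def build_command(cpus):
    """Build a command line by string concatenation."""
    # '+' between a str and an int raises TypeError in Python
    return "hisat2-build -p " + cpus


def try_build(cpus):
    """Return the command, or a short description of the failure."""
    try:
        return build_command(cpus)
    except TypeError as err:
        return "TypeError: " + str(err)


print(try_build(8))    # integer value: concatenation fails
print(try_build('8'))  # string value: concatenation works
```

With the string value the helper returns `hisat2-build -p 8`; with the integer it reports a TypeError instead of a command.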

egaffo commented 1 year ago

One more thing, not related to your error: you do not need to run the docker command with sudo. Just add your user to the docker group.

pecoraro90 commented 1 year ago

Hi egaffo,

I tried

CPUS = '8' instead of CPUS = 8

but I still got the same error:

user@NGS:~/CirComPara/data$ sudo docker run -u `id -u` --rm -it -v /home/user/CirComPara/:/data egaffo/circompara2:v0.1.2.1
scons: Reading SConscript files ...
TypeError: cannot concatenate 'str' and 'int' objects:
  File "/circompara2/src/sconstructs/main.py", line 489:
    exports = '''env_check_indexes''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/check_indexes.py", line 118:
    exports = '''env_build_indexes ''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/build_indexes.py", line 67:
    exports = '''env_index_hisat2 ''')
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 660:
    return method(*args, **kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 597:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/circompara2/tools/scons/scons-local-3.1.2/SCons/Script/SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "/circompara2/src/sconstructs/index_hisat2.py", line 44:
    ''' ${TARGETS[0].dir}''' + os.path.sep + target_basename + ''' ''' + EXTRA_PARAMS

egaffo commented 1 year ago

Try deleting the hidden file .sconsign.dblite, as it could have been corrupted by previous runs. I was able to reproduce your error with a minimal vars.py file like yours, and solved the issue by using CPUS = '8' (a string instead of an integer).

Assuming all files (meta.csv, vars.py, genome FASTA, annotation GTF, and input FASTQ) are in your current directory, with a vars.py as follows

META = 'meta.csv'
GENOME_FASTA = 'Homo_sapiens.GRCh38.dna.primary_assembly.fa'
ANNOTATION = 'Homo_sapiens.GRCh38.104.gtf'
CPUS = '8'

you should have no errors. Do mind the path of your current directory: I see you were in ~/CirComPara/data, but mounted the volume /home/user/CirComPara/ in the docker command.
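A quick way to spot this kind of slip before launching a run is to scan the vars.py text for values that are not strings. The sketch below assumes, as suggested for CPUS, that parameters are expected to be string literals. The `non_string_settings` helper is hypothetical and not part of CirComPara2. It only inspects simple top-level assignments.

```python
import ast


def non_string_settings(vars_text):
    """Return {name: value} for top-level assignments of non-string literals."""
    flagged = {}
    for node in ast.parse(vars_text).body:
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only plain literals are checked; expressions are left alone
        if isinstance(value, ast.Constant) and not isinstance(value.value, str):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    flagged[target.id] = value.value
    return flagged


example = """
META = 'meta.csv'
GENOME_FASTA = '/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa'
ANNOTATION = '/data/Homo_sapiens.GRCh38.104.gtf'
CPUS = 8
"""

print(non_string_settings(example))  # {'CPUS': 8}
```

Parsing with `ast` avoids executing the file, so the check is safe to run on any vars.py.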

pecoraro90 commented 1 year ago

Hi egaffo, I actually did not see any dot file, but by mistake I deleted the whole working directory (ouch). However, when I created a new folder from scratch with all the files, it started to work properly. Anyway, after a few hours of running it returned this error:

gzip: stdout: Broken pipe
/bin/bash: line 1: 619 Segmentation fault (core dumped) STAR --outSAMstrandField intronMotif --outSAMtype BAM SortedByCoordinate --runRNGseed 123 --outSJfilterOverhangMin 15 5 15 --alignSJoverhangMin 15 --alignSJDBoverhangMin 15 --seedSearchStartLmax 30 --outFilterScoreMin 1 --outFilterMatchNmin 1 --outFilterMismatchNmax 2 --chimSegmentMin 15 --chimScor 15 --chimScoreSeparation 10 --chimJunctionOverhangMin 15 --chimOutType Junctions --runThreadN 8 --genomeDir /data/dbs/indexes/indexes/star/Homo_sapiens.GRCh38.dna.primary_assembly adFilesIn "/data/samples/S1/processings/circRNAs/S1.unmappedSE.fq.gz" --readFilesCommand zcat --sjdbGTFfile /data/Homo_sapiens.GRCh38.104.gtf
scons: *** [samples/S1/processings/circRNAs/star_out/Aligned.sortedByCoord.out.bam] Error 139
scons: building terminated because of errors.

I guess it's a memory problem, isn't it? How many CPUs do you usually use? I have 8 CPUs and 32 GB of RAM.

egaffo commented 1 year ago

That is probably due to a lack of "execute" permissions on the working directory, because STAR generates an executable command to decompress input files. It happened to me when working on an NFS partition where I had not granted execute permissions to the users. You can check whether this is your case by simply creating an executable bash script in the directory and launching it (e.g. echo -e '#!/bin/bash\necho Ciao' > mockcmd.sh; chmod +x mockcmd.sh; ./mockcmd.sh). If you do not have the rights to change the partition/directory permissions, a workaround is to tell STAR to use another temp dir. Take a look at issue #4 for further details.
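The same check can be scripted. The sketch below is a rough Python equivalent of the mockcmd.sh test: it writes a tiny shell script into a directory, marks it executable, and tries to run it. The `can_execute_scripts` name is hypothetical, and the sketch assumes /bin/sh is available. It is an illustration of the check, not something CirComPara2 or STAR provides.

```python
import os
import stat
import subprocess
import tempfile


def can_execute_scripts(directory):
    """Return True if a script created in `directory` can be executed."""
    try:
        fd, path = tempfile.mkstemp(suffix=".sh", dir=directory)
    except OSError:
        # Directory missing or not writable
        return False
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("#!/bin/sh\necho Ciao\n")
        # Add the user-execute bit, like `chmod +x`
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
        result = subprocess.run([path], capture_output=True, text=True)
        return result.returncode == 0 and result.stdout.strip() == "Ciao"
    except OSError:
        # E.g. PermissionError on a partition mounted without exec rights
        return False
    finally:
        os.remove(path)


print(can_execute_scripts(tempfile.gettempdir()))
```

Pointing the function at the pipeline's working directory instead of the system temp directory tests the location that matters for STAR.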

Regarding your hardware, you may run out of memory: one of the chimeric aligners (Segemehl) needs roughly 60 GB of RAM to load the human genome index. STAR is also quite demanding (as you already seem to be aware), and 32 GB should be the minimum amount of RAM needed to load the human genome index. Depending on your system, once a program asks for more memory than is available, the OS will fall back on a swap partition or similar, which slows down computation. I suggest you monitor your system resources while running CirComPara2. If you find that your machine cannot handle the STAR and/or Segemehl indexes for the human genome, consider dropping the circRNA detection methods based on Segemehl (i.e. circexplorer2_segemehl and testrealign, although testrealign is not run by default) or STAR (i.e. dcc, circexplorer2_star, and circrna_finder; again, circrna_finder is not used by default). You will still be able to use the other tools, which need less RAM, by setting, in the vars.py file,

CIRCRNA_METHODS = 'circexplorer2_bwa,circexplorer2_tophat,ciri,findcirc'

which exploits only TopHat2, BWA, and Bowtie2, with at most 6 GB of RAM used (by BWA). With such a setting, circRNA detection recall will be lower, but it is still the best you can do with that machine. In that case, you can also consider using the -j2 option of CirComPara2 to run two tasks in parallel (remember to set CPUS='4'!), so peak RAM usage will be 6x2=12 GB. It will likely speed up the whole computation.
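The arithmetic behind that suggestion can be written down as a small sketch. It assumes the cores are split evenly across parallel tasks and that each task may reach the per-task peak at the same time. The `plan_parallel_run` helper is hypothetical and only mirrors the numbers quoted above (8 cores, -j2, 6 GB per task).

```python
def plan_parallel_run(total_cpus, jobs, peak_ram_per_task_gb):
    """Split cores across parallel jobs and estimate worst-case RAM use."""
    cpus_per_task = max(1, total_cpus // jobs)
    return {
        "CPUS": str(cpus_per_task),                    # string, as vars.py expects
        "jobs_flag": "-j%d" % jobs,                    # option passed to the run
        "peak_ram_gb": jobs * peak_ram_per_task_gb,    # all jobs peaking at once
    }


print(plan_parallel_run(8, 2, 6))
# {'CPUS': '4', 'jobs_flag': '-j2', 'peak_ram_gb': 12}
```

The estimate is an upper bound: tasks rarely peak at the same moment, so actual usage may be lower.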

pecoraro90 commented 1 year ago

Hi egaffo, I ran the test script in the same working directory and it worked:

echo -e '#!/bin/bash\necho Ciao' > mockcmd.sh
chmod +x mockcmd.sh
./mockcmd.sh
Ciao

I will try excluding STAR from the pipeline. I can also try to ask for more RAM on the machine. I will let you know.

pecoraro90 commented 1 year ago

Hi Enrico, I tried running circompara2 excluding STAR and Segemehl with the option

CIRCRNA_METHODS = 'circexplorer2_bwa,circexplorer2_tophat,ciri,findcirc'

It worked for a while, then it returned this error. I am pasting only the last lines because the whole log is really long:

SAM was divided successfully.
First read of divided SAM files:
S3_bwa.sam.tempab: SRR7120617.10185748
S3_bwa.sam.tempac: SRR7120618.10136952
S3_bwa.sam.tempad: SRR7120619.10506075
S3_bwa.sam.tempaa: SRR7120617.2
First reads were recorded successfully.
[Thu Sep 15 18:01:17 2022] First scanning
Worker 1 begins to scan S3_bwa.sam.tempaa.
Worker 2 begins to scan S3_bwa.sam.tempad.
Worker 3 begins to scan S3_bwa.sam.tempac.
Worker 4 begins to scan S3_bwa.sam.tempab.
Worker 1 finished reporting.
Worker 2 finished reporting.
left reads: min. length=35, max. length=76, 6564177 kept reads (170865 discarded)
[2022-09-15 18:02:03] Building transcriptome data files samples/S3/processings/circRNAs/tophat_out/tmp/Homo_sapiens.GRCh38.104
Worker 3 finished reporting.
Worker 4 finished reporting.
Candidate reads with splicing signals: 12418
Candidate reads with PEM signals: 12121
Candidate circRNAs found: 6914
[Thu Sep 15 18:02:13 2022] Second scanning
Worker 5 begins to scan S3_bwa.sam.tempaa.
Worker 6 begins to scan S3_bwa.sam.tempad.
Worker 7 begins to scan S3_bwa.sam.tempac.
Worker 8 begins to scan S3_bwa.sam.tempab.
[FAILED]
Error: gtf_to_fasta returned an error.
scons: *** [samples/S3/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1
Worker 5 finished reporting.
Worker 6 finished reporting.
Worker 7 finished reporting.
Worker 8 finished reporting.
[Thu Sep 15 18:03:30 2022] Extracting info from temporary files
Additional candidate reads found: 2268
Additional candidate reads with PEM signals: 2186
[Thu Sep 15 18:03:32 2022] Summarizing
Number of circular RNAs found: 1873
[Thu Sep 15 18:03:35 2022] CIRI finished its work. Please see output file S3_ciri.out for detail.
scons: building terminated because of errors.

The file CIRIerror.log of the S3 sample is empty.

Any suggestions?

pecoraro90 commented 1 year ago

I am pasting the whole log just in case you want to check:

(base) lorenabuono@NGS:~/CirComPara$ sudo docker run -u `id -u` --rm -it -v /home/lorenabuono/CirComPara/:/data egaffo/circompara2:v0.1.2.1 -j2
scons: Reading SConscript files ...
No read preprocessing specified
No read preprocessing specified
No read preprocessing specified
No read preprocessing specified
scons: done reading SConscript files.
scons: Building targets ...
hisat2-build -f --seed 1 -p 4 /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly
SymLink(["dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.fa"], ["Homo_sapiens.GRCh38.dna.primary_assembly.fa"])
Settings:
Output files: "dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly.*.ht2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Local sequence length: 57344
Local sequence overlap between two consecutive indexes: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 1
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa
Reading reference sizes
bowtie2-build -f --seed 1 --threads 4 /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly
Settings:
Output files: "dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 16
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 1
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa Building a SMALL index Reading reference sizes Time reading reference sizes: 00:00:25 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time reading reference sizes: 00:00:28 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:16 Time to read SNPs and splice sites: 00:00:00 Using parameters --bmax 521303816 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 521303816 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples Time to join reference sequences: 00:00:19 bmax according to bmaxDivN setting: 173767938 Using parameters --bmax 130325954 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 130325954 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:57 Allocating rank array Ranking v-sort output V-Sorting samples time: 00:00:56 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:35 Invoking Larsson-Sadakane on ranks Ranking v-sort output time: 00:00:33 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:53 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Invoking Larsson-Sadakane on ranks time: 00:00:54 Sanity-checking and returning Building samples Reserving space for 44 sample suffixes Generating random suffixes QSorting 44 
sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 44 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 6; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 3, merged 18; iterating... Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 3.47536e+08 (target: 521303815) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering GFM loop Getting block 1 of 8 Reserving size (521303816) for bucket 1 Getting block 2 of 8 Reserving size (521303816) for bucket 2 Calculating Z arrays for bucket 1 Calculating Z arrays for bucket 2 Entering block accumulator loop for bucket 1: Entering block accumulator loop for bucket 2: Getting block 3 of 8 Getting block 4 of 8 Reserving size (521303816) for bucket 4 Reserving size (521303816) for bucket 3 Calculating Z arrays for bucket 4 Calculating Z arrays for bucket 3 Entering block accumulator loop for bucket 4: Entering block accumulator loop for bucket 3: bucket 1: 10% bucket 2: 10% bucket 4: 10% bucket 1: 20% bucket 3: 10% bucket 2: 20% bucket 1: 30% bucket 4: 20% bucket 1: 40% bucket 2: 30% bucket 3: 20% Splitting and merging Splitting and merging time: 00:00:00 Split 3, merged 2; iterating... 
bucket 1: 50% bucket 4: 30% bucket 2: 40% bucket 1: 60% bucket 3: 30% bucket 2: 50% bucket 4: 40% bucket 1: 70% bucket 2: 60% bucket 1: 80% bucket 3: 40% bucket 4: 50% bucket 1: 90% bucket 2: 70% bucket 1: 100% Sorting block of length 390727629 for bucket 1 (Using difference cover) bucket 4: 60% bucket 2: 80% bucket 3: 50% bucket 2: 90% bucket 4: 70% bucket 3: 60% bucket 2: 100% Sorting block of length 287025447 for bucket 2 (Using difference cover) Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 3; iterating... bucket 3: 70% bucket 4: 80% bucket 3: 80% bucket 4: 90% bucket 3: 90% bucket 4: 100% Sorting block of length 326786289 for bucket 4 (Using difference cover) bucket 3: 100% Sorting block of length 388741822 for bucket 3 (Using difference cover) Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 9.5872e+07 (target: 130325953) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 29 Reserving size (130325954) for bucket 1 Getting block 2 of 29 Getting block 3 of 29 Getting block 4 of 29 Calculating Z arrays for bucket 1 Reserving size (130325954) for bucket 2 Reserving size (130325954) for bucket 3 Reserving size (130325954) for bucket 4 Entering block accumulator loop for bucket 1: Calculating Z arrays for bucket 2 Calculating Z arrays for bucket 3 Calculating Z arrays for bucket 4 Entering block accumulator loop for bucket 3: Entering block accumulator loop for bucket 2: Entering block accumulator loop for bucket 4: bucket 1: 10% bucket 2: 10% bucket 3: 10% bucket 4: 10% bucket 1: 20% bucket 2: 20% bucket 3: 20% bucket 4: 20% bucket 1: 30% bucket 2: 30% bucket 3: 30% bucket 4: 30% bucket 1: 40% bucket 2: 40% bucket 3: 40% bucket 1: 50% bucket 4: 40% bucket 2: 50% bucket 1: 60% bucket 3: 50% bucket 4: 50% bucket 2: 60% bucket 1: 70% bucket 3: 60% bucket 4: 60% bucket 2: 70% bucket 1: 80% bucket 3: 70% bucket 4: 70% bucket 2: 80% bucket 1: 90% bucket 
3: 80% bucket 2: 90% bucket 4: 80% bucket 1: 100% Sorting block of length 122116269 for bucket 1 (Using difference cover) bucket 3: 90% bucket 2: 100% Sorting block of length 120727442 for bucket 2 (Using difference cover) bucket 4: 90% bucket 3: 100% Sorting block of length 112467957 for bucket 3 (Using difference cover) bucket 4: 100% Sorting block of length 127835020 for bucket 4 (Using difference cover) Sorting block time: 00:02:34 Returning block of 122116270 for bucket 1 Getting block 5 of 29 Reserving size (130325954) for bucket 5 Calculating Z arrays for bucket 5 Entering block accumulator loop for bucket 5: Sorting block time: 00:02:32 Returning block of 112467958 for bucket 3 bucket 5: 10% Getting block 6 of 29 Reserving size (130325954) for bucket 6 Calculating Z arrays for bucket 6 Entering block accumulator loop for bucket 6: Sorting block time: 00:02:41 Returning block of 120727443 for bucket 2 Getting block 7 of 29 Reserving size (130325954) for bucket 7 Calculating Z arrays for bucket 7 Entering block accumulator loop for bucket 7: bucket 5: 20% bucket 6: 10% bucket 5: 30% bucket 7: 10% bucket 6: 20% bucket 5: 40% bucket 7: 20% bucket 6: 30% bucket 5: 50% bucket 7: 30% bucket 6: 40% Sorting block time: 00:03:03 Returning block of 127835021 for bucket 4 bucket 5: 60% bucket 6: 50% bucket 7: 40% Getting block 8 of 29 Reserving size (130325954) for bucket 8 Calculating Z arrays for bucket 8 Entering block accumulator loop for bucket 8: bucket 5: 70% bucket 6: 60% bucket 7: 50% bucket 8: 10% bucket 5: 80% bucket 6: 70% bucket 7: 60% bucket 8: 20% bucket 5: 90% bucket 6: 80% bucket 7: 70% bucket 8: 30% bucket 5: 100% Sorting block of length 68001757 for bucket 5 (Using difference cover) bucket 6: 90% bucket 7: 80% bucket 8: 40% bucket 6: 100% Sorting block of length 126604627 for bucket 6 (Using difference cover) bucket 7: 90% bucket 8: 50% bucket 7: 100% Sorting block of length 82490839 for bucket 7 (Using difference cover) bucket 8: 60% bucket 8: 70% 
bucket 8: 80% bucket 8: 90% bucket 8: 100% Sorting block of length 116591394 for bucket 8 (Using difference cover) Sorting block time: 00:01:48 Returning block of 68001758 for bucket 5 Getting block 9 of 29 Reserving size (130325954) for bucket 9 Calculating Z arrays for bucket 9 Entering block accumulator loop for bucket 9: bucket 9: 10% Sorting block time: 00:06:59 Returning block of 287025448 for bucket 2 bucket 9: 20% Getting block 5 of 8 Reserving size (521303816) for bucket 5 Calculating Z arrays for bucket 5 Entering block accumulator loop for bucket 5: bucket 9: 30% bucket 5: 10% bucket 9: 40% bucket 5: 20% Sorting block time: 00:02:04 Returning block of 82490840 for bucket 7 bucket 9: 50% Getting block 10 of 29 Reserving size (130325954) for bucket 10 Calculating Z arrays for bucket 10 Entering block accumulator loop for bucket 10: bucket 5: 30% bucket 9: 60% bucket 5: 40% bucket 10: 10% bucket 9: 70% bucket 5: 50% bucket 10: 20% bucket 5: 60% bucket 9: 80% bucket 10: 30% bucket 5: 70% bucket 9: 90% bucket 10: 40% bucket 5: 80% bucket 9: 100% Sorting block of length 19754611 for bucket 9 (Using difference cover) bucket 10: 50% bucket 5: 90% bucket 5: 100% Sorting block of length 454119924 for bucket 5 (Using difference cover) bucket 10: 60% bucket 10: 70% bucket 10: 80% bucket 10: 90% Sorting block time: 00:03:06 Returning block of 126604628 for bucket 6 Getting block 11 of 29 Reserving size (130325954) for bucket 11 Calculating Z arrays for bucket 11 Entering block accumulator loop for bucket 11: bucket 10: 100% Sorting block of length 124947236 for bucket 10 (Using difference cover) Sorting block time: 00:00:30 Returning block of 19754612 for bucket 9 Getting block 12 of 29 Reserving size (130325954) for bucket 12 Calculating Z arrays for bucket 12 Entering block accumulator loop for bucket 12: bucket 11: 10% bucket 12: 10% bucket 11: 20% bucket 12: 20% bucket 11: 30% Sorting block time: 00:02:54 Returning block of 116591395 for bucket 8 bucket 12: 30% 
Getting block 13 of 29 Reserving size (130325954) for bucket 13 Calculating Z arrays for bucket 13 Entering block accumulator loop for bucket 13: bucket 11: 40% bucket 12: 40% bucket 13: 10% bucket 11: 50% bucket 12: 50% Sorting block time: 00:08:45 Returning block of 326786290 for bucket 4 bucket 13: 20% bucket 11: 60% bucket 12: 60% bucket 13: 30% bucket 11: 70% Getting block 6 of 8 Reserving size (521303816) for bucket 6 Calculating Z arrays for bucket 6 Entering block accumulator loop for bucket 6: bucket 12: 70% Sorting block time: 00:09:25 Returning block of 390727630 for bucket 1 bucket 11: 80% bucket 13: 40% bucket 6: 10% bucket 12: 80% bucket 6: 20% bucket 11: 90% bucket 13: 50% bucket 12: 90% bucket 6: 30% Getting block 7 of 8 Reserving size (521303816) for bucket 7 Calculating Z arrays for bucket 7 Entering block accumulator loop for bucket 7: bucket 11: 100% Sorting block of length 103509098 for bucket 11 (Using difference cover) bucket 13: 60% bucket 7: 10% bucket 6: 40% bucket 12: 100% Sorting block of length 77487613 for bucket 12 (Using difference cover) bucket 13: 70% bucket 7: 20% bucket 6: 50% bucket 7: 30% bucket 13: 80% bucket 6: 60% bucket 7: 40% bucket 13: 90% bucket 6: 70% bucket 7: 50% bucket 13: 100% Sorting block of length 88785625 for bucket 13 (Using difference cover) bucket 7: 60% bucket 6: 80% bucket 7: 70% bucket 6: 90% bucket 7: 80% bucket 7: 90% bucket 6: 100% Sorting block of length 197542472 for bucket 6 (Using difference cover) bucket 7: 100% Sorting block of length 365252593 for bucket 7 (Using difference cover) Sorting block time: 00:10:10 Returning block of 388741823 for bucket 3 Getting block 8 of 8 Reserving size (521303816) for bucket 8 Calculating Z arrays for bucket 8 Entering block accumulator loop for bucket 8: bucket 8: 10% bucket 8: 20% bucket 8: 30% bucket 8: 40% bucket 8: 50% bucket 8: 60% bucket 8: 70% bucket 8: 80% bucket 8: 90% bucket 8: 100% Sorting block of length 370090833 for bucket 8 (Using difference 
cover) Sorting block time: 00:02:11 Returning block of 77487614 for bucket 12 Getting block 14 of 29 Reserving size (130325954) for bucket 14 Calculating Z arrays for bucket 14 Entering block accumulator loop for bucket 14: Sorting block time: 00:03:30 Returning block of 124947237 for bucket 10 Getting block 15 of 29 Reserving size (130325954) for bucket 15 Calculating Z arrays for bucket 15 Entering block accumulator loop for bucket 15: bucket 14: 10% bucket 15: 10% bucket 15: 20% bucket 14: 20% bucket 15: 30% Sorting block time: 00:02:50 Returning block of 103509099 for bucket 11 bucket 14: 30% Getting block 16 of 29 Reserving size (130325954) for bucket 16 Calculating Z arrays for bucket 16 Entering block accumulator loop for bucket 16: Sorting block time: 00:02:29 Returning block of 88785626 for bucket 13 bucket 15: 40% Getting block 17 of 29 Reserving size (130325954) for bucket 17 Calculating Z arrays for bucket 17 Entering block accumulator loop for bucket 17: bucket 16: 10% bucket 14: 40% bucket 15: 50% bucket 17: 10% bucket 16: 20% bucket 15: 60% bucket 14: 50% bucket 17: 20% bucket 16: 30% bucket 14: 60% bucket 15: 70% bucket 17: 30% bucket 16: 40% bucket 14: 70% bucket 15: 80% bucket 17: 40% bucket 16: 50% bucket 15: 90% bucket 14: 80% bucket 17: 50% bucket 16: 60% bucket 15: 100% Sorting block of length 122354008 for bucket 15 (Using difference cover) bucket 14: 90% bucket 16: 70% bucket 17: 60% bucket 14: 100% Sorting block of length 101961689 for bucket 14 (Using difference cover) bucket 16: 80% bucket 17: 70% bucket 16: 90% bucket 17: 80% bucket 16: 100% Sorting block of length 80654302 for bucket 16 (Using difference cover) bucket 17: 90% bucket 17: 100% Sorting block of length 120439801 for bucket 17 (Using difference cover) Sorting block time: 00:02:15 Returning block of 80654303 for bucket 16 Getting block 18 of 29 Reserving size (130325954) for bucket 18 Calculating Z arrays for bucket 18 Entering block accumulator loop for bucket 18: Sorting 
block time: 00:05:49 Returning block of 197542473 for bucket 6 bucket 18: 10% Sorting block time: 00:02:45 Returning block of 101961690 for bucket 14 bucket 18: 20% Getting block 19 of 29 Reserving size (130325954) for bucket 19 Calculating Z arrays for bucket 19 Entering block accumulator loop for bucket 19: bucket 18: 30% bucket 19: 10% bucket 18: 40% bucket 19: 20% bucket 18: 50% bucket 19: 30% bucket 18: 60% Sorting block time: 00:03:25 Returning block of 122354009 for bucket 15 bucket 19: 40% bucket 18: 70% Getting block 20 of 29 Reserving size (130325954) for bucket 20 Calculating Z arrays for bucket 20 Entering block accumulator loop for bucket 20: bucket 19: 50% bucket 18: 80% bucket 20: 10% bucket 19: 60% bucket 18: 90% bucket 20: 20% bucket 19: 70% bucket 18: 100% Sorting block of length 37983759 for bucket 18 (Using difference cover) bucket 20: 30% bucket 19: 80% bucket 20: 40% bucket 19: 90% Sorting block time: 00:03:34 Returning block of 120439802 for bucket 17 bucket 19: 100% Sorting block of length 92688050 for bucket 19 (Using difference cover) bucket 20: 50% Getting block 21 of 29 Reserving size (130325954) for bucket 21 Calculating Z arrays for bucket 21 Entering block accumulator loop for bucket 21: bucket 20: 60% bucket 21: 10% bucket 20: 70% bucket 21: 20% bucket 20: 80% bucket 21: 30% bucket 20: 90% bucket 21: 40% bucket 20: 100% Sorting block of length 95082895 for bucket 20 (Using difference cover) Sorting block time: 00:01:04 Returning block of 37983760 for bucket 18 bucket 21: 50% Getting block 22 of 29 Reserving size (130325954) for bucket 22 Calculating Z arrays for bucket 22 Entering block accumulator loop for bucket 22: bucket 22: 10% bucket 21: 60% bucket 22: 20% bucket 21: 70% bucket 22: 30% bucket 21: 80% bucket 22: 40% bucket 21: 90% bucket 22: 50% bucket 21: 100% Sorting block of length 85843609 for bucket 21 (Using difference cover) bucket 22: 60% bucket 22: 70% bucket 22: 80% bucket 22: 90% bucket 22: 100% Sorting block of 
length 111721859 for bucket 22 (Using difference cover) Sorting block time: 00:02:24 Returning block of 92688051 for bucket 19 Getting block 23 of 29 Reserving size (130325954) for bucket 23 Calculating Z arrays for bucket 23 Entering block accumulator loop for bucket 23: bucket 23: 10% bucket 23: 20% bucket 23: 30% bucket 23: 40% bucket 23: 50% Sorting block time: 00:02:23 Returning block of 95082896 for bucket 20 bucket 23: 60% Getting block 24 of 29 Reserving size (130325954) for bucket 24 Calculating Z arrays for bucket 24 Entering block accumulator loop for bucket 24: Sorting block time: 00:10:15 Returning block of 365252594 for bucket 7 bucket 23: 70% bucket 24: 10% bucket 23: 80% bucket 24: 20% Sorting block time: 00:01:52 Returning block of 85843610 for bucket 21 Getting block 25 of 29 Reserving size (130325954) for bucket 25 Calculating Z arrays for bucket 25 Entering block accumulator loop for bucket 25: bucket 23: 90% bucket 24: 30% bucket 25: 10% bucket 23: 100% Sorting block of length 68822364 for bucket 23 (Using difference cover) Sorting block time: 00:13:01 Returning block of 454119925 for bucket 5 bucket 24: 40% bucket 25: 20% bucket 24: 50% bucket 25: 30% bucket 24: 60% bucket 25: 40% bucket 24: 70% bucket 25: 50% Sorting block time: 00:10:09 Returning block of 370090834 for bucket 8 bucket 24: 80% bucket 25: 60% bucket 24: 90% bucket 25: 70% bucket 24: 100% Sorting block of length 104276855 for bucket 24 (Using difference cover) bucket 25: 80% Sorting block time: 00:02:23 Returning block of 111721860 for bucket 22 bucket 25: 90% Getting block 26 of 29 Reserving size (130325954) for bucket 26 Calculating Z arrays for bucket 26 Entering block accumulator loop for bucket 26: bucket 25: 100% Sorting block of length 121954001 for bucket 25 (Using difference cover) bucket 26: 10% bucket 26: 20% bucket 26: 30% bucket 26: 40% bucket 26: 50% Sorting block time: 00:01:29 Returning block of 68822365 for bucket 23 Getting block 27 of 29 Reserving size 
(130325954) for bucket 27 Calculating Z arrays for bucket 27 Entering block accumulator loop for bucket 27: bucket 26: 60% bucket 27: 10% bucket 26: 70% bucket 27: 20% bucket 26: 80% bucket 27: 30% bucket 26: 90% bucket 27: 40% bucket 27: 50% bucket 26: 100% Sorting block of length 90611040 for bucket 26 (Using difference cover) bucket 27: 60% bucket 27: 70% bucket 27: 80% bucket 27: 90% bucket 27: 100% Sorting block of length 117651698 for bucket 27 (Using difference cover) Sorting block time: 00:02:20 Returning block of 104276856 for bucket 24 Getting block 28 of 29 Reserving size (130325954) for bucket 28 Calculating Z arrays for bucket 28 Entering block accumulator loop for bucket 28: bucket 28: 10% bucket 28: 20% bucket 28: 30% bucket 28: 40% Sorting block time: 00:02:39 Returning block of 121954002 for bucket 25 bucket 28: 50% Getting block 29 of 29 Reserving size (130325954) for bucket 29 Calculating Z arrays for bucket 29 Entering block accumulator loop for bucket 29: bucket 29: 10% bucket 28: 60% bucket 29: 20% bucket 28: 70% bucket 29: 30% bucket 29: 40% Sorting block time: 00:01:53 Returning block of 90611041 for bucket 26 bucket 28: 80% bucket 29: 50% bucket 29: 60% bucket 28: 90% bucket 29: 70% bucket 28: 100% Sorting block of length 60214144 for bucket 28 (Using difference cover) bucket 29: 80% bucket 29: 90% bucket 29: 100% Sorting block of length 76707426 for bucket 29 (Using difference cover) Sorting block time: 00:02:30 Returning block of 117651699 for bucket 27 Sorting block time: 00:01:15 Returning block of 60214145 for bucket 28 Sorting block time: 00:01:42 Returning block of 76707427 for bucket 29 Exited Ebwt loop fchr[A]: 0 fchr[C]: 819570787 fchr[G]: 1387714628 fchr[T]: 1958169038 fchr[$]: 2780287016 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 930969014 bytes to primary EBWT file: dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.1.bt2 Wrote 695071760 bytes to secondary EBWT file: 
dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 2780287016 bwtLen: 2780287017 sz: 695071754 bwtSz: 695071755 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 173767939 offsSz: 695071756 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 14480662 numLines: 14480662 ebwtTotLen: 926762368 ebwtTotSz: 926762368 color: 0 reverse: 0 Total time for call to driver() for forward index: 00:32:53 Reading reference sizes Time reading reference sizes: 00:00:21 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:19 Time to reverse reference sequence: 00:00:08 bmax according to bmaxDivN setting: 173767938 Using parameters --bmax 130325954 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 130325954 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:56 Allocating rank array Ranking v-sort output Exited GFM loop fchr[A]: 0 fchr[C]: 819570787 fchr[G]: 1387714628 fchr[T]: 1958169038 fchr[$]: 2780287016 Exiting GFM::buildToDisk() Returning from initFromVector Wrote 930969034 bytes to primary GFM file: dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly.1.ht2 Wrote 695071760 bytes to secondary GFM file: dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly.2.ht2 Re-opening _in1 and _in2 as input streams Returning from GFM constructor Ranking v-sort output time: 00:00:32 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:55 Sanity-checking and returning Building samples Reserving space for 44 sample suffixes Generating random suffixes QSorting 44 sample offsets, eliminating 
duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 44 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Split 5, merged 20; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 2, merged 2; iterating... Splitting and merging Splitting and merging time: 00:00:00 Split 2, merged 1; iterating... Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 9.5872e+07 (target: 130325953) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 29 Reserving size (130325954) for bucket 1 Getting block 2 of 29 Getting block 3 of 29 Getting block 4 of 29 Calculating Z arrays for bucket 1 Reserving size (130325954) for bucket 2 Reserving size (130325954) for bucket 3 Reserving size (130325954) for bucket 4 Entering block accumulator loop for bucket 1: Calculating Z arrays for bucket 2 Calculating Z arrays for bucket 4 Calculating Z arrays for bucket 3 Entering block accumulator loop for bucket 2: Entering block accumulator loop for bucket 3: Entering block accumulator loop for bucket 4: bucket 1: 10% bucket 2: 10% bucket 3: 10% bucket 4: 10% bucket 1: 20% bucket 2: 20% bucket 3: 20% bucket 4: 20% bucket 1: 30% bucket 2: 30% bucket 3: 30% bucket 4: 30% bucket 1: 40% bucket 2: 40% bucket 3: 40% bucket 1: 50% bucket 4: 40% bucket 2: 50% bucket 3: 50% bucket 1: 60% bucket 4: 50% bucket 2: 60% bucket 1: 70% bucket 3: 60% bucket 2: 70% bucket 4: 60% Returning from initFromVector Wrote 1221451461 bytes to primary GFM file: dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly.5.ht2 Wrote 707807016 bytes to secondary GFM file: dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly.6.ht2 Re-opening _in5 and _in5 as input streams Returning from HierEbwt constructor Headers: len: 2780287016 gbwtLen: 2780287017 nodes: 
2780287017 sz: 695071754 gbwtSz: 695071755 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 0 eftabSz: 0 ftabLen: 1048577 ftabSz: 4194308 offsLen: 173767939 offsSz: 695071756 lineSz: 64 sideSz: 64 sideGbwtSz: 48 sideGbwtLen: 192 numSides: 14480662 numLines: 14480662 gbwtTotLen: 926762368 gbwtTotSz: 926762368 reverse: 0 linearFM: Yes Total time for call to driver() for forward index: 00:39:22 bucket 1: 80% bucket 3: 70% bucket 2: 80% bucket 4: 70% bucket 1: 90% bucket 3: 80% bucket 2: 90% bucket 1: 100% Sorting block of length 113392836 for bucket 1 (Using difference cover) bucket 4: 80% bucket 2: 100% Sorting block of length 68226735 for bucket 2 (Using difference cover) bucket 3: 90% bucket 4: 90% bucket 3: 100% Sorting block of length 121629933 for bucket 3 (Using difference cover) bucket 4: 100% Sorting block of length 122215956 for bucket 4 (Using difference cover) Sorting block time: 00:01:33 Returning block of 68226736 for bucket 2 Getting block 5 of 29 Reserving size (130325954) for bucket 5 Calculating Z arrays for bucket 5 Entering block accumulator loop for bucket 5: bucket 5: 10% bucket 5: 20% bucket 5: 30% bucket 5: 40% bucket 5: 50% bucket 5: 60% bucket 5: 70% bucket 5: 80% bucket 5: 90% bucket 5: 100% Sorting block of length 23608915 for bucket 5 (Using difference cover) Sorting block time: 00:02:29 Returning block of 113392837 for bucket 1 Getting block 6 of 29 Reserving size (130325954) for bucket 6 Calculating Z arrays for bucket 6 Entering block accumulator loop for bucket 6: bucket 6: 10% bucket 6: 20% Sorting block time: 00:02:40 Returning block of 121629934 for bucket 3 bucket 6: 30% Getting block 7 of 29 Reserving size (130325954) for bucket 7 Calculating Z arrays for bucket 7 Entering block accumulator loop for bucket 7: Sorting block time: 00:00:34 Returning block of 23608916 for bucket 5 bucket 6: 40% Getting block 8 of 29 Reserving size (130325954) for bucket 8 Calculating Z arrays for bucket 8 Entering block accumulator 
loop for bucket 8: bucket 7: 10% bucket 6: 50% bucket 8: 10% bucket 7: 20% Sorting block time: 00:02:57 Returning block of 122215957 for bucket 4 bucket 6: 60% bucket 8: 20% bucket 7: 30% Getting block 9 of 29 Reserving size (130325954) for bucket 9 Calculating Z arrays for bucket 9 Entering block accumulator loop for bucket 9: bucket 6: 70% bucket 8: 30% bucket 7: 40% bucket 9: 10% bucket 6: 80% bucket 8: 40% bucket 7: 50% bucket 9: 20% bucket 6: 90% bucket 8: 50% bucket 7: 60% bucket 9: 30% bucket 6: 100% Sorting block of length 123710747 for bucket 6 (Using difference cover) bucket 8: 60% bucket 7: 70% bucket 9: 40% bucket 8: 70% bucket 7: 80% bucket 8: 80% bucket 9: 50% bucket 7: 90% bucket 8: 90% bucket 9: 60% bucket 7: 100% Sorting block of length 130151631 for bucket 7 (Using difference cover) bucket 8: 100% Sorting block of length 99310920 for bucket 8 (Using difference cover) bucket 9: 70% bucket 9: 80% bucket 9: 90% bucket 9: 100% Sorting block of length 98903209 for bucket 9 (Using difference cover) cd /data/samples/S1/processings/hisat2_out && hisat2 -x /data/dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly --seed 123 --dta --dta-cufflinks --rg-id scordant --no-mixed -p 4 -1 /data/LNCAP-AD-rep1_1.fastq -2 /data/LNCAP-AD-rep1_2.fastq 2> /data/samples/S1/processings/hisat2_out/S1_hisat2.log | samtools sort -m 768M -O -T hisat2_S1 > /data/samples/S1/processings/hisat2_out/S1_hisat2.bam && cd /data Sorting block time: 00:02:15 Returning block of 99310921 for bucket 8 Getting block 10 of 29 Reserving size (130325954) for bucket 10 Calculating Z arrays for bucket 10 Entering block accumulator loop for bucket 10: bucket 10: 10% bucket 10: 20% bucket 10: 30% Sorting block time: 00:02:59 Returning block of 123710748 for bucket 6 bucket 10: 40% Getting block 11 of 29 Reserving size (130325954) for bucket 11 Calculating Z arrays for bucket 11 Entering block accumulator loop for bucket 11: bucket 10: 50% bucket 11: 10% Sorting block time: 
00:02:24 Returning block of 98903210 for bucket 9 Getting block 12 of 29 Reserving size (130325954) for bucket 12 Calculating Z arrays for bucket 12 Entering block accumulator loop for bucket 12: bucket 10: 60% bucket 11: 20% bucket 12: 10% bucket 10: 70% bucket 11: 30% Sorting block time: 00:03:04 Returning block of 130151632 for bucket 7 bucket 12: 20% Getting block 13 of 29 Reserving size (130325954) for bucket 13 Calculating Z arrays for bucket 13 Entering block accumulator loop for bucket 13: bucket 10: 80% bucket 11: 40% bucket 12: 30% bucket 10: 90% bucket 13: 10% bucket 11: 50% bucket 10: 100% Sorting block of length 107735832 for bucket 10 (Using difference cover) bucket 12: 40% bucket 13: 20% bucket 11: 60% bucket 12: 50% bucket 13: 30% bucket 11: 70% bucket 12: 60% bucket 13: 40% bucket 11: 80% bucket 12: 70% bucket 11: 90% bucket 13: 50% bucket 12: 80% bucket 11: 100% Sorting block of length 124390994 for bucket 11 (Using difference cover) bucket 13: 60% bucket 12: 90% bucket 13: 70% bucket 12: 100% Sorting block of length 90465337 for bucket 12 (Using difference cover) bucket 13: 80% bucket 13: 90% bucket 13: 100% Sorting block of length 107127738 for bucket 13 (Using difference cover) Sorting block time: 00:02:50 Returning block of 107735833 for bucket 10 Getting block 14 of 29 Reserving size (130325954) for bucket 14 Calculating Z arrays for bucket 14 Entering block accumulator loop for bucket 14: bucket 14: 10% bucket 14: 20% Sorting block time: 00:02:26 Returning block of 90465338 for bucket 12 Getting block 15 of 29 Reserving size (130325954) for bucket 15 Calculating Z arrays for bucket 15 Entering block accumulator loop for bucket 15: bucket 14: 30% bucket 15: 10% bucket 14: 40% bucket 15: 20% bucket 14: 50% bucket 15: 30% bucket 14: 60% bucket 15: 40% bucket 15: 50% bucket 14: 70% Sorting block time: 00:02:46 Returning block of 107127739 for bucket 13 Getting block 16 of 29 Reserving size (130325954) for bucket 16 Calculating Z arrays for 
bucket 16 Entering block accumulator loop for bucket 16: Sorting block time: 00:03:22 Returning block of 124390995 for bucket 11 bucket 15: 60% bucket 14: 80% Getting block 17 of 29 Reserving size (130325954) for bucket 17 Calculating Z arrays for bucket 17 Entering block accumulator loop for bucket 17: bucket 16: 10% bucket 15: 70% bucket 14: 90% bucket 17: 10% bucket 16: 20% bucket 15: 80% bucket 17: 20% bucket 14: 100% Sorting block of length 102858347 for bucket 14 (Using difference cover) bucket 16: 30% bucket 15: 90% bucket 17: 30% bucket 16: 40% bucket 15: 100% Sorting block of length 76467644 for bucket 15 (Using difference cover) bucket 17: 40% bucket 16: 50% bucket 17: 50% bucket 16: 60% bucket 17: 60% bucket 16: 70% bucket 17: 70% bucket 16: 80% bucket 17: 80% bucket 16: 90% bucket 17: 90% bucket 16: 100% Sorting block of length 102065078 for bucket 16 (Using difference cover) bucket 17: 100% Sorting block of length 121733936 for bucket 17 (Using difference cover) Sorting block time: 00:01:59 Returning block of 76467645 for bucket 15 Getting block 18 of 29 Reserving size (130325954) for bucket 18 Calculating Z arrays for bucket 18 Entering block accumulator loop for bucket 18: bucket 18: 10% bucket 18: 20% bucket 18: 30% bucket 18: 40% Sorting block time: 00:02:38 Returning block of 102858348 for bucket 14 Getting block 19 of 29 Reserving size (130325954) for bucket 19 Calculating Z arrays for bucket 19 Entering block accumulator loop for bucket 19: bucket 18: 50% bucket 19: 10% bucket 18: 60% bucket 19: 20% bucket 18: 70% bucket 19: 30% bucket 18: 80% bucket 19: 40% bucket 18: 90% bucket 19: 50% bucket 18: 100% Sorting block of length 69902842 for bucket 18 (Using difference cover) bucket 19: 60% bucket 19: 70% Sorting block time: 00:02:45 Returning block of 102065079 for bucket 16 Getting block 20 of 29 Reserving size (130325954) for bucket 20 Calculating Z arrays for bucket 20 Entering block accumulator loop for bucket 20: bucket 19: 80% bucket 20: 
10% bucket 19: 90% bucket 19: 100% Sorting block of length 124768785 for bucket 19 (Using difference cover) bucket 20: 20% bucket 20: 30% Sorting block time: 00:03:14 Returning block of 121733937 for bucket 17 Getting block 21 of 29 Reserving size (130325954) for bucket 21 Calculating Z arrays for bucket 21 Entering block accumulator loop for bucket 21: bucket 20: 40% bucket 21: 10% bucket 20: 50% bucket 21: 20% bucket 20: 60% bucket 21: 30% bucket 20: 70% bucket 21: 40% bucket 21: 50% bucket 20: 80% bucket 21: 60% bucket 20: 90% bucket 21: 70% Sorting block time: 00:01:50 Returning block of 69902843 for bucket 18 bucket 20: 100% Sorting block of length 85439626 for bucket 20 (Using difference cover) Getting block 22 of 29 Reserving size (130325954) for bucket 22 Calculating Z arrays for bucket 22 Entering block accumulator loop for bucket 22: bucket 21: 80% bucket 22: 10% bucket 21: 90% bucket 22: 20% bucket 21: 100% Sorting block of length 116165819 for bucket 21 (Using difference cover) bucket 22: 30% bucket 22: 40% bucket 22: 50% bucket 22: 60% bucket 22: 70% bucket 22: 80% bucket 22: 90% bucket 22: 100% Sorting block of length 114666601 for bucket 22 (Using difference cover) Sorting block time: 00:03:20 Returning block of 124768786 for bucket 19 Sorting block time: 00:02:07 Returning block of 85439627 for bucket 20 Getting block 23 of 29 Reserving size (130325954) for bucket 23 Calculating Z arrays for bucket 23 Entering block accumulator loop for bucket 23: Getting block 24 of 29 Reserving size (130325954) for bucket 24 Calculating Z arrays for bucket 24 Entering block accumulator loop for bucket 24: bucket 23: 10% bucket 24: 10% bucket 23: 20% bucket 24: 20% bucket 23: 30% bucket 24: 30% bucket 24: 40% bucket 23: 40% bucket 24: 50% bucket 23: 50% bucket 24: 60% bucket 23: 60% bucket 24: 70% bucket 23: 70% bucket 24: 80% Sorting block time: 00:02:55 Returning block of 116165820 for bucket 21 bucket 23: 80% Getting block 25 of 29 Reserving size (130325954) for 
bucket 25 Calculating Z arrays for bucket 25 Entering block accumulator loop for bucket 25: bucket 24: 90% bucket 23: 90% bucket 25: 10% bucket 24: 100% Sorting block of length 88228979 for bucket 24 (Using difference cover) bucket 23: 100% Sorting block of length 116984014 for bucket 23 (Using difference cover) bucket 25: 20% bucket 25: 30% bucket 25: 40% bucket 25: 50% bucket 25: 60% Sorting block time: 00:02:56 Returning block of 114666602 for bucket 22 bucket 25: 70% Getting block 26 of 29 Reserving size (130325954) for bucket 26 Calculating Z arrays for bucket 26 Entering block accumulator loop for bucket 26: bucket 25: 80% bucket 26: 10% bucket 25: 90% bucket 26: 20% bucket 25: 100% Sorting block of length 42686039 for bucket 25 (Using difference cover) bucket 26: 30% bucket 26: 40% bucket 26: 50% bucket 26: 60% bucket 26: 70% bucket 26: 80% bucket 26: 90% bucket 26: 100% Sorting block of length 96575057 for bucket 26 (Using difference cover) Sorting block time: 00:01:04 Returning block of 42686040 for bucket 25 Getting block 27 of 29 Reserving size (130325954) for bucket 27 Calculating Z arrays for bucket 27 Entering block accumulator loop for bucket 27: bucket 27: 10% bucket 27: 20% Sorting block time: 00:02:25 Returning block of 88228980 for bucket 24 Getting block 28 of 29 Reserving size (130325954) for bucket 28 Calculating Z arrays for bucket 28 Entering block accumulator loop for bucket 28: bucket 27: 30% bucket 28: 10% bucket 27: 40% bucket 28: 20% bucket 27: 50% bucket 28: 30% bucket 27: 60% bucket 28: 40% bucket 27: 70% bucket 28: 50% bucket 27: 80% bucket 28: 60% bucket 27: 90% bucket 28: 70% bucket 27: 100% Sorting block of length 52943713 for bucket 27 (Using difference cover) Sorting block time: 00:03:05 Returning block of 116984015 for bucket 23 bucket 28: 80% Getting block 29 of 29 Reserving size (130325954) for bucket 29 Calculating Z arrays for bucket 29 Entering block accumulator loop for bucket 29: bucket 29: 10% bucket 28: 90% bucket 29: 
20% bucket 29: 30% bucket 28: 100% Sorting block of length 97641280 for bucket 28 (Using difference cover) bucket 29: 40% bucket 29: 50% bucket 29: 60% bucket 29: 70% bucket 29: 80% bucket 29: 90% bucket 29: 100% Sorting block of length 40288445 for bucket 29 (Using difference cover) Sorting block time: 00:02:22 Returning block of 96575058 for bucket 26 Sorting block time: 00:01:20 Returning block of 52943714 for bucket 27 Sorting block time: 00:00:58 Returning block of 40288446 for bucket 29 Sorting block time: 00:02:25 Returning block of 97641281 for bucket 28 Exited Ebwt loop fchr[A]: 0 fchr[C]: 819570787 fchr[G]: 1387714628 fchr[T]: 1958169038 fchr[$]: 2780287016 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 930969014 bytes to primary EBWT file: dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.rev.1.bt2 Wrote 695071760 bytes to secondary EBWT file: dbs/indexes/indexes/bowtie2/Homo_sapiens.GRCh38.dna.primary_assembly.rev.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 2780287016 bwtLen: 2780287017 sz: 695071754 bwtSz: 695071755 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 173767939 offsSz: 695071756 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 14480662 numLines: 14480662 ebwtTotLen: 926762368 ebwtTotSz: 926762368 color: 0 reverse: 1 Total time for backward call to driver() for mirror index: 00:33:17 cd /data/samples/S2/processings/hisat2_out && hisat2 -x /data/dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly --seed 123 --dta --dta-cufflinks --rg-id scordant --no-mixed -p 4 -1 /data/LNCAP-AD-rep2_1.fastq -2 /data/LNCAP-AD-rep2_2.fastq 2> /data/samples/S2/processings/hisat2_out/S2_hisat2.log | samtools sort -m 768M -O -T hisat2_S2 > /data/samples/S2/processings/hisat2_out/S2_hisat2.bam && cd /data [bam_sort_core] merging from 36 files and 4 in-memory blocks... 
samtools index samples/S1/processings/hisat2_out/S1_hisat2.bam samples/S1/processings/hisat2_out/S1_hisat2.bam.bai
samtools fastq -f 12 -F 3328 -n -s /data/samples/S1/processings/unmapped_reads/singleton.fastq -1 /data/samples/S1/processings/unmapped_reads/unmapped_1.fastq -2 /data/samocessings/unmapped_reads/unmapped_2.fastq samples/S1/processings/hisat2_out/S1_hisat2.bam && gzip /data/samples/S1/processings/unmapped_reads/unmapped_1.fastq && gzip /datS1/processings/unmapped_reads/unmapped_2.fastq && gzip /data/samples/S1/processings/unmapped_reads/singleton.fastq
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 7652970 reads
trim_read_header.py -s '\' -f samples/S1/processings/unmapped_reads/unmapped_1.fastq.gz -r samples/S1/processings/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S1gs/circRNAs/S1.unmappedSE.fq.gz
unmapped2anchors.py -Q <( zcat samples/S1/processings/circRNAs/S1.unmappedSE.fq.gz ) | bowtie2 --seed 123 -p 4 --reorder --score-min=C,-15,0 -q -x /data/dbs/indexes/indexeHomo_sapiens.GRCh38.dna.primary_assembly -U - 2> samples/S1/processings/circRNAs/find_circ_out/bt2_secondpass.log | find_circ.py -G /data/Homo_sapiens.GRCh38.dna.primarya -p S1 -s samples/S1/processings/circRNAs/find_circ_out/find_circ.log -R samples/S1/processings/circRNAs/find_circ_out/sites.reads > samples/S1/processings/circRNAs/find_ites.bed
[bam_sort_core] merging from 28 files and 4 in-memory blocks...
samtools index samples/S2/processings/hisat2_out/S2_hisat2.bam samples/S2/processings/hisat2_out/S2_hisat2.bam.bai
samtools fastq -f 12 -F 3328 -n -s /data/samples/S2/processings/unmapped_reads/singleton.fastq -1 /data/samples/S2/processings/unmapped_reads/unmapped_1.fastq -2 /data/samocessings/unmapped_reads/unmapped_2.fastq samples/S2/processings/hisat2_out/S2_hisat2.bam && gzip /data/samples/S2/processings/unmapped_reads/unmapped_1.fastq && gzip /datS2/processings/unmapped_reads/unmapped_2.fastq && gzip /data/samples/S2/processings/unmapped_reads/singleton.fastq
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 6829328 reads
trim_read_header.py -s '\' -f samples/S2/processings/unmapped_reads/unmapped_1.fastq.gz -r samples/S2/processings/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S2gs/circRNAs/S2.unmappedSE.fq.gz
unmapped2anchors.py -Q <( zcat samples/S2/processings/circRNAs/S2.unmappedSE.fq.gz ) | bowtie2 --seed 123 -p 4 --reorder --score-min=C,-15,0 -q -x /data/dbs/indexes/indexe_sapiens.GRCh38.dna.primary_assembly -U - 2> samples/S2/processings/circRNAs/find_circ_out/bt2_secondpass.log | find_circ.py -G /data/Homo_sapiens.GRCh38.dna.primary_assem-s samples/S2/processings/circRNAs/find_circ_out/find_circ.log -R samples/S2/processings/circRNAs/find_circ_out/sites.reads > samples/S2/processings/circRNAs/find_circ_out
filter_findcirc_res.R -i samples/S1/processings/circRNAs/find_circ_out/sites.bed -o samples/S1/processings/circRNAs/find_circ_out/circ_candidates.bed -q 40 -f CIRCULAR,UNAP,ANCHOR_UNIQUE
cd /data/samples/S3/processings/hisat2_out && hisat2 -x /data/dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly --seed 123 --dta --dta-cufflinks --rg-id scordant --no-mixed -p 4 -1 /data/LNCAP-AI-rep1_1.fastq -2 /data/LNCAP-AI-rep1_2.fastq 2> /data/samples/S3/processings/hisat2_out/S3_hisat2.log | samtools sort -m 768M -O -T hisat2_S3 > /data/samples/S3/processings/hisat2_out/S3_hisat2.bam && cd /data
filter_findcirc_res.R -i samples/S2/processings/circRNAs/find_circ_out/sites.bed -o samples/S2/processings/circRNAs/find_circ_out/circ_candidates.bed -q 40 -f CIRCULAR,UNAP,ANCHOR_UNIQUE
cd /data/samples/S4/processings/hisat2_out && hisat2 -x /data/dbs/indexes/indexes/hisat2/Homo_sapiens.GRCh38.dna.primary_assembly --seed 123 --dta --dta-cufflinks --rg-id scordant --no-mixed -p 4 -1 /data/LNCAP-AI-rep2_1.fastq -2 /data/LNCAP-AI-rep2_2.fastq 2> /data/samples/S4/processings/hisat2_out/S4_hisat2.log | samtools sort -m 768M -O -T hisat2_S4 > /data/samples/S4/processings/hisat2_out/S4_hisat2.bam && cd /data
[bam_sort_core] merging from 32 files and 4 in-memory blocks...
samtools index samples/S3/processings/hisat2_out/S3_hisat2.bam samples/S3/processings/hisat2_out/S3_hisat2.bam.bai
samtools fastq -f 12 -F 3328 -n -s /data/samples/S3/processings/unmapped_reads/singleton.fastq -1 /data/samples/S3/processings/unmapped_reads/unmapped_1.fastq -2 /data/samrocessings/unmapped_reads/unmapped_2.fastq samples/S3/processings/hisat2_out/S3_hisat2.bam && gzip /data/samples/S3/processings/unmapped_reads/unmapped_1.fastq && gzip /das/S3/processings/unmapped_reads/unmapped_2.fastq && gzip /data/samples/S3/processings/unmapped_reads/singleton.fastq
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 6735042 reads
trim_read_header.py -s '\' -f samples/S3/processings/unmapped_reads/unmapped_1.fastq.gz -r samples/S3/processings/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S3ngs/circRNAs/S3.unmappedSE.fq.gz
unmapped2anchors.py -Q <( zcat samples/S3/processings/circRNAs/S3.unmappedSE.fq.gz ) | bowtie2 --seed 123 -p 4 --reorder --score-min=C,-15,0 -q -x /data/dbs/indexes/indexe/Homo_sapiens.GRCh38.dna.primary_assembly -U - 2> samples/S3/processings/circRNAs/find_circ_out/bt2_secondpass.log | find_circ.py -G /data/Homo_sapiens.GRCh38.dna.primaryfa -p S3 -s samples/S3/processings/circRNAs/find_circ_out/find_circ.log -R samples/S3/processings/circRNAs/find_circ_out/sites.reads > samples/S3/processings/circRNAs/fint/sites.bed
[bam_sort_core] merging from 28 files and 4 in-memory blocks...
samtools index samples/S4/processings/hisat2_out/S4_hisat2.bam samples/S4/processings/hisat2_out/S4_hisat2.bam.bai
samtools fastq -f 12 -F 3328 -n -s /data/samples/S4/processings/unmapped_reads/singleton.fastq -1 /data/samples/S4/processings/unmapped_reads/unmapped_1.fastq -2 /data/samessings/unmapped_reads/unmapped_2.fastq samples/S4/processings/hisat2_out/S4_hisat2.bam && gzip /data/samples/S4/processings/unmapped_reads/unmapped_1.fastq && gzip /data/rocessings/unmapped_reads/unmapped_2.fastq && gzip /data/samples/S4/processings/unmapped_reads/singleton.fastq
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 8851198 reads
trim_read_header.py -s '\' -f samples/S4/processings/unmapped_reads/unmapped_1.fastq.gz -r samples/S4/processings/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S4/circRNAs/S4.unmappedSE.fq.gz
unmapped2anchors.py -Q <( zcat samples/S4/processings/circRNAs/S4.unmappedSE.fq.gz ) | bowtie2 --seed 123 -p 4 --reorder --score-min=C,-15,0 -q -x /data/dbs/indexes/indexemo_sapiens.GRCh38.dna.primary_assembly -U - 2> samples/S4/processings/circRNAs/find_circ_out/bt2_secondpass.log | find_circ.py -G /data/Homo_sapiens.GRCh38.dna.primaryassS4 -s samples/S4/processings/circRNAs/find_circ_out/find_circ.log -R samples/S4/processings/circRNAs/find_circ_out/sites.reads > samples/S4/processings/circRNAs/find_circed
filter_findcirc_res.R -i samples/S3/processings/circRNAs/find_circ_out/sites.bed -o samples/S3/processings/circRNAs/find_circ_out/circ_candidates.bed -q 40 -f CIRCULAR,UNAANCHOR_UNIQUE
gtfToGenePred -infoOut=dbs/indexes/genePred.transcripts.info Homo_sapiens.GRCh38.104.gtf dbs/indexes/Homo_sapiens.GRCh38.104.genePred
cut -f2 dbs/indexes/genePred.transcripts.info | grep -v geneId | paste - dbs/indexes/Homosapiens.GRCh38.104.genePred | sed "s^\t([^\t])\t(.)\1\t\1\t\2" > dbs/indpiens.GRCh38.104.genePred.wgn
SymLink(["dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly.fa"], ["Homo_sapiens.GRCh38.dna.primary_assembly.fa"])
bowtie-build -f --seed 1 /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly
Settings:
Output files: "dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly..ebwt"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 5 (one in 32)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 1
Sizeofs: void:8, int:4, long:8, size_t:8
Input files DNA, FASTA: /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa
Reading reference sizes
Time reading reference sizes: 00:00:29
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:30
bmax according to bmaxDivN setting: 695071754
Using parameters --bmax 521303816 --dcv 1024
Doing ahead-of-time memory usage test
Passed!
Constructing with these parameters: --bmax 521303816 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:01:56 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:34 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:51 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:51 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 6; iterating... Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:39 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 1; iterating... 
Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:38 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 3.47536e+08 (target: 521303815) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:30 Sorting block of length 390727629 (Using difference cover) filter_findcirc_res.R -i samples/S4/processings/circRNAs/find_circ_out/sites.bed -o samples/S4/processings/circRNAs/find_circ_out/circ_candidates.bed -q 40 -f CIRCULAR,UNABP,ANCHOR_UNIQUE findcirc_compare.R -l S1,S2,S3,S4 -i samples/S1/processings/circRNAs/find_circ_out/circ_candidates.bed,samples/S2/processings/circRNAs/find_circ_out/circ_candidates.bed,saprocessings/circRNAs/find_circ_out/circ_candidates.bed,samples/S4/processings/circRNAs/find_circ_out/circ_candidates.bed -o circular_expression/circrna_collection/merged_srcrnas/find_circ_compared.csv bwa index -a bwtsw -p dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa [bwa_index] Pack FASTA... 32.76 sec [bwa_index] Construct BWT for the packed sequence... [BWTIncCreate] textLength=5798579604, availableWord=420009088 [BWTIncConstructFromPacked] 10 iterations done. 99999988 characters processed. [BWTIncConstructFromPacked] 20 iterations done. 199999988 characters processed. [BWTIncConstructFromPacked] 30 iterations done. 299999988 characters processed. [BWTIncConstructFromPacked] 40 iterations done. 399999988 characters processed. [BWTIncConstructFromPacked] 50 iterations done. 499999988 characters processed. Sorting block time: 00:08:23 Returning block of 390727630 [BWTIncConstructFromPacked] 60 iterations done. 599999988 characters processed. 
[BWTIncConstructFromPacked] 70 iterations done. 699999988 characters processed. [BWTIncConstructFromPacked] 80 iterations done. 799999988 characters processed. Getting block 2 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:34 Sorting block of length 287025447 (Using difference cover) [BWTIncConstructFromPacked] 90 iterations done. 899999988 characters processed. [BWTIncConstructFromPacked] 100 iterations done. 999999988 characters processed. [BWTIncConstructFromPacked] 110 iterations done. 1099999988 characters processed. [BWTIncConstructFromPacked] 120 iterations done. 1199999988 characters processed. [BWTIncConstructFromPacked] 130 iterations done. 1299999988 characters processed. [BWTIncConstructFromPacked] 140 iterations done. 1399999988 characters processed. [BWTIncConstructFromPacked] 150 iterations done. 1499999988 characters processed. [BWTIncConstructFromPacked] 160 iterations done. 1599999988 characters processed. [BWTIncConstructFromPacked] 170 iterations done. 1699999988 characters processed. Sorting block time: 00:06:23 Returning block of 287025448 [BWTIncConstructFromPacked] 180 iterations done. 1799999988 characters processed. [BWTIncConstructFromPacked] 190 iterations done. 1899999988 characters processed. Getting block 3 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:41 Sorting block of length 388741822 (Using difference cover) [BWTIncConstructFromPacked] 200 iterations done. 1999999988 characters processed. [BWTIncConstructFromPacked] 210 iterations done. 2099999988 characters processed. [BWTIncConstructFromPacked] 220 iterations done. 2199999988 characters processed. [BWTIncConstructFromPacked] 230 iterations done. 
2299999988 characters processed. [BWTIncConstructFromPacked] 240 iterations done. 2399999988 characters processed. [BWTIncConstructFromPacked] 250 iterations done. 2499999988 characters processed. [BWTIncConstructFromPacked] 260 iterations done. 2599999988 characters processed. [BWTIncConstructFromPacked] 270 iterations done. 2699999988 characters processed. [BWTIncConstructFromPacked] 280 iterations done. 2799999988 characters processed. [BWTIncConstructFromPacked] 290 iterations done. 2899999988 characters processed. [BWTIncConstructFromPacked] 300 iterations done. 2999999988 characters processed. Sorting block time: 00:08:56 Returning block of 388741823 [BWTIncConstructFromPacked] 310 iterations done. 3099999988 characters processed. [BWTIncConstructFromPacked] 320 iterations done. 3199999988 characters processed. Getting block 4 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% [BWTIncConstructFromPacked] 330 iterations done. 3299999988 characters processed. 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:48 Sorting block of length 326786289 (Using difference cover) [BWTIncConstructFromPacked] 340 iterations done. 3399999988 characters processed. [BWTIncConstructFromPacked] 350 iterations done. 3499999988 characters processed. [BWTIncConstructFromPacked] 360 iterations done. 3599999988 characters processed. [BWTIncConstructFromPacked] 370 iterations done. 3699999988 characters processed. [BWTIncConstructFromPacked] 380 iterations done. 3799999988 characters processed. [BWTIncConstructFromPacked] 390 iterations done. 3899999988 characters processed. [BWTIncConstructFromPacked] 400 iterations done. 3999999988 characters processed. [BWTIncConstructFromPacked] 410 iterations done. 4099999988 characters processed. [BWTIncConstructFromPacked] 420 iterations done. 4199999988 characters processed. 
Sorting block time: 00:07:55 Returning block of 326786290 [BWTIncConstructFromPacked] 430 iterations done. 4299999988 characters processed. [BWTIncConstructFromPacked] 440 iterations done. 4399999988 characters processed. Getting block 5 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% [BWTIncConstructFromPacked] 450 iterations done. 4499999988 characters processed. 100% Block accumulator loop time: 00:00:45 Sorting block of length 454119924 (Using difference cover) [BWTIncConstructFromPacked] 460 iterations done. 4599999988 characters processed. [BWTIncConstructFromPacked] 470 iterations done. 4699999988 characters processed. [BWTIncConstructFromPacked] 480 iterations done. 4799999988 characters processed. [BWTIncConstructFromPacked] 490 iterations done. 4899999988 characters processed. [BWTIncConstructFromPacked] 500 iterations done. 4999999988 characters processed. [BWTIncConstructFromPacked] 510 iterations done. 5099999988 characters processed. [BWTIncConstructFromPacked] 520 iterations done. 5196637460 characters processed. [BWTIncConstructFromPacked] 530 iterations done. 5282805924 characters processed. [BWTIncConstructFromPacked] 540 iterations done. 5359388692 characters processed. [BWTIncConstructFromPacked] 550 iterations done. 5427451636 characters processed. [BWTIncConstructFromPacked] 560 iterations done. 5487942148 characters processed. [BWTIncConstructFromPacked] 570 iterations done. 5541702212 characters processed. Sorting block time: 00:10:54 Returning block of 454119925 [BWTIncConstructFromPacked] 580 iterations done. 5589480308 characters processed. [BWTIncConstructFromPacked] 590 iterations done. 5631941556 characters processed. [BWTIncConstructFromPacked] 600 iterations done. 5669677188 characters processed. [BWTIncConstructFromPacked] 610 iterations done. 5703212708 characters processed. 
[BWTIncConstructFromPacked] 620 iterations done. 5733015140 characters processed. [BWTIncConstructFromPacked] 630 iterations done. 5759499556 characters processed. Getting block 6 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% [BWTIncConstructFromPacked] 640 iterations done. 5783034964 characters processed. 30% 40% 50% 60% [bwt_gen] Finished constructing BWT in 648 iterations. [bwa_index] 3038.16 seconds elapse. [bwa_index] Update BWT... 70% 80% 90% 100% Block accumulator loop time: 00:00:44 Sorting block of length 252437129 (Using difference cover) 33.29 sec [bwa_index] Pack forward-only FASTA... 20.10 sec [bwa_index] Construct SA from BWT and Occ... Sorting block time: 00:05:20 Returning block of 252437130 Getting block 7 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:41 Sorting block of length 373545292 (Using difference cover) Sorting block time: 00:08:31 Returning block of 373545293 Getting block 8 of 8 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:32 Sorting block of length 306903477 (Using difference cover) Sorting block time: 00:07:00 Returning block of 306903478 Exited Ebwt loop fchr[A]: 0 fchr[C]: 819570787 fchr[G]: 1387714628 fchr[T]: 1958169038 fchr[$]: 2780287016 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 798574390 bytes to primary EBWT file: dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly.1.ebwt Wrote 347535884 bytes to secondary EBWT file: dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly.2.ebwt Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 
2780287016 bwtLen: 2780287017 sz: 695071754 bwtSz: 695071755 lineRate: 6 linesPerSide: 1 offRate: 5 offMask: 0xffffffe0 isaRate: -1 isaMask: 0xffffffff ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 86883970 offsSz: 347535880 isaLen: 0 isaSz: 0 lineSz: 64 sideSz: 64 sideBwtSz: 56 sideBwtLen: 224 numSidePairs: 6205998 numSides: 12411996 numLines: 12411996 ebwtTotLen: 794367744 ebwtTotSz: 794367744 reverse: 0 Total time for call to driver() for forward index: 01:32:06 Reading reference sizes Time reading reference sizes: 00:00:17 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:39 bmax according to bmaxDivN setting: 695071754 Using parameters --bmax 521303816 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 521303816 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:01:55 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:32 Invoking Larsson-Sadakane on ranks 1613.77 sec [main] Version: 0.7.15-r1140 [main] CMD: bwa index -a bwtsw -p dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa [main] Real time: 4960.413 sec; CPU: 4738.090 sec bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S1/processings/unmapped_reads/unmapped_1.fastq.gz /data/samples/S1/gs/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S1/processings/circRNAs/bwa_out/S1_bwa.sam.gz [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 530256 sequences (40000073 bp)... [M::process] read 530278 sequences (40000042 bp)... 
Invoking Larsson-Sadakane on ranks time: 00:00:55 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (33, 77988, 812, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (24, 33, 95) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 237) [M::mem_pestat] mean and std.dev: (39.15, 29.83) [M::mem_pestat] low and high boundaries for proper pairs: (1, 308) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 165, 346) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 802) [M::mem_pestat] mean and std.dev: (182.00, 124.65) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1030) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (341, 1372, 3721) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10481) [M::mem_pestat] mean and std.dev: (2400.81, 2584.07) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13861) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 20% 30% 40% [M::mem_process_seqs] Processed 530256 reads in 295.398 CPU sec, 73.675 real sec 50% 60% [M::process] read 530324 sequences (40000097 bp)... 70% 80% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (42, 69972, 865, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... 
[M::mem_pestat] (25, 50, 75) percentile: (22, 50, 125) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 331) [M::mem_pestat] mean and std.dev: (59.65, 57.32) [M::mem_pestat] low and high boundaries for proper pairs: (1, 434) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (117, 157, 252) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 522) [M::mem_pestat] mean and std.dev: (163.08, 79.04) [M::mem_pestat] low and high boundaries for proper pairs: (1, 657) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (449, 1479, 3757) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10373) [M::mem_pestat] mean and std.dev: (2508.92, 2632.26) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13681) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 90% 100% Binary sorting into buckets time: 00:01:50 Splitting and merging Splitting and merging time: 00:00:00 Split 2, merged 7; iterating... Binary sorting into buckets 10% [M::mem_process_seqs] Processed 530278 reads in 270.369 CPU sec, 67.412 real sec 20% [M::process] read 530250 sequences (40000012 bp)... 30% 40% 50% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (49, 69475, 826, 14) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (33, 61, 289) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 801) [M::mem_pestat] mean and std.dev: (76.17, 113.61) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1057) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (117, 156, 247) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 507) [M::mem_pestat] mean and std.dev: (162.12, 77.15) [M::mem_pestat] low and high boundaries for proper pairs: (1, 637) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (299, 1129, 3392) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9578) [M::mem_pestat] mean and std.dev: (2162.15, 2419.26) [M::mem_pestat] low and high boundaries for proper pairs: (1, 12671) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (33, 52, 93) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 213) [M::mem_pestat] mean and std.dev: (46.55, 22.01) [M::mem_pestat] low and high boundaries for proper pairs: (1, 273) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR 60% 70% [M::mem_process_seqs] Processed 530324 reads in 268.432 CPU sec, 66.872 real sec 80% 90% [M::process] read 530264 sequences (40000007 bp)... 100% Binary sorting into buckets time: 00:01:38 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 2; iterating... Binary sorting into buckets 10% 20% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (46, 68796, 902, 10) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (29, 71, 181) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 485) [M::mem_pestat] mean and std.dev: (72.84, 76.46) [M::mem_pestat] low and high boundaries for proper pairs: (1, 637) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (117, 155, 245) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 501) [M::mem_pestat] mean and std.dev: (161.14, 75.79) [M::mem_pestat] low and high boundaries for proper pairs: (1, 629) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (331, 1252, 3929) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11125) [M::mem_pestat] mean and std.dev: (2477.72, 2719.93) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14723) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (37, 614, 1235) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 3631) [M::mem_pestat] mean and std.dev: (486.44, 541.02) [M::mem_pestat] low and high boundaries for proper pairs: (1, 4829) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR 30% 40% [M::mem_process_seqs] Processed 530250 reads in 267.445 CPU sec, 66.624 real sec 50% 60% [M::process] read 530264 sequences (40000090 bp)... 70% 80% 90% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (49, 67491, 874, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (27, 66, 271) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 759) [M::mem_pestat] mean and std.dev: (94.20, 128.26) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1003) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (117, 155, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 492) [M::mem_pestat] mean and std.dev: (161.66, 75.21) [M::mem_pestat] low and high boundaries for proper pairs: (1, 617) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (332, 1171, 3389) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9503) [M::mem_pestat] mean and std.dev: (2210.64, 2487.41) [M::mem_pestat] low and high boundaries for proper pairs: (1, 12560) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 100% Binary sorting into buckets time: 00:01:35 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 3.97184e+08 (target: 521303815) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% [M::mem_process_seqs] Processed 530264 reads in 267.157 CPU sec, 66.579 real sec 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:32 Sorting block of length 519579663 (Using difference cover) [M::process] read 530252 sequences (40000148 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (37, 68539, 844, 12) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (25, 48, 120) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 310) [M::mem_pestat] mean and std.dev: (46.63, 41.69) [M::mem_pestat] low and high boundaries for proper pairs: (1, 405) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (117, 156, 243) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 495) [M::mem_pestat] mean and std.dev: (162.29, 75.85) [M::mem_pestat] low and high boundaries for proper pairs: (1, 621) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (210, 1103, 3384) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9732) [M::mem_pestat] mean and std.dev: (2208.04, 2635.00) [M::mem_pestat] low and high boundaries for proper pairs: (1, 12906) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (51, 112, 579) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1635) [M::mem_pestat] mean and std.dev: (221.82, 279.87) [M::mem_pestat] low and high boundaries for proper pairs: (1, 2163) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530264 reads in 271.505 CPU sec, 67.638 real sec [M::process] read 530266 sequences (40000061 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (52, 67793, 894, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (24, 44, 118) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 306) [M::mem_pestat] mean and std.dev: (51.40, 49.61) [M::mem_pestat] low and high boundaries for proper pairs: (1, 400) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (117, 155, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 492) [M::mem_pestat] mean and std.dev: (161.84, 75.58) [M::mem_pestat] low and high boundaries for proper pairs: (1, 617) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (439, 1581, 4221) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11785) [M::mem_pestat] mean and std.dev: (2671.14, 2738.96) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15567) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530252 reads in 276.182 CPU sec, 68.832 real sec [M::process] read 530250 sequences (40000048 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (49, 68314, 905, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (20, 68, 226) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 638) [M::mem_pestat] mean and std.dev: (78.28, 104.83) [M::mem_pestat] low and high boundaries for proper pairs: (1, 844) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (116, 155, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 494) [M::mem_pestat] mean and std.dev: (161.54, 75.65) [M::mem_pestat] low and high boundaries for proper pairs: (1, 620) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (378, 1535, 3996) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11232) [M::mem_pestat] mean and std.dev: (2513.08, 2611.48) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14850) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530266 reads in 277.270 CPU sec, 69.125 real sec [M::process] read 530344 sequences (40000100 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (42, 68145, 910, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (27, 59, 128) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 330) [M::mem_pestat] mean and std.dev: (54.41, 38.49) [M::mem_pestat] low and high boundaries for proper pairs: (1, 431) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (117, 156, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 492) [M::mem_pestat] mean and std.dev: (162.26, 75.89) [M::mem_pestat] low and high boundaries for proper pairs: (1, 617) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (347, 1454, 4065) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11501) [M::mem_pestat] mean and std.dev: (2539.88, 2706.73) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15219) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530250 reads in 274.741 CPU sec, 68.441 real sec [M::process] read 530292 sequences (40000104 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (57, 67729, 911, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (30, 53, 178) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 474) [M::mem_pestat] mean and std.dev: (67.74, 81.12) [M::mem_pestat] low and high boundaries for proper pairs: (1, 622) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (116, 155, 240) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 488) [M::mem_pestat] mean and std.dev: (161.22, 75.07) [M::mem_pestat] low and high boundaries for proper pairs: (1, 612) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (319, 1325, 3717) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10513) [M::mem_pestat] mean and std.dev: (2406.11, 2654.50) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13911) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530344 reads in 275.912 CPU sec, 68.738 real sec [M::process] read 530262 sequences (40000008 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (41, 68784, 890, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (20, 55, 174) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 482) [M::mem_pestat] mean and std.dev: (79.11, 102.44) [M::mem_pestat] low and high boundaries for proper pairs: (1, 636) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (116, 155, 238) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 482) [M::mem_pestat] mean and std.dev: (160.51, 74.12) [M::mem_pestat] low and high boundaries for proper pairs: (1, 604) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (303, 1329, 3311) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9327) [M::mem_pestat] mean and std.dev: (2180.70, 2392.55) [M::mem_pestat] low and high boundaries for proper pairs: (1, 12335) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530292 reads in 273.685 CPU sec, 68.198 real sec [M::process] read 530312 sequences (40000066 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (52, 67736, 935, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (22, 53, 93) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 235) [M::mem_pestat] mean and std.dev: (45.93, 31.88) [M::mem_pestat] low and high boundaries for proper pairs: (1, 306) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (116, 155, 239) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 485) [M::mem_pestat] mean and std.dev: (161.16, 74.75) [M::mem_pestat] low and high boundaries for proper pairs: (1, 608) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (384, 1505, 4088) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11496) [M::mem_pestat] mean and std.dev: (2557.97, 2669.94) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15200) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530262 reads in 274.398 CPU sec, 68.415 real sec [M::process] read 530288 sequences (40000024 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (73, 67709, 924, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (28, 62, 508) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1468) [M::mem_pestat] mean and std.dev: (197.98, 316.81) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1948) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (117, 155, 240) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 486) [M::mem_pestat] mean and std.dev: (161.39, 75.11) [M::mem_pestat] low and high boundaries for proper pairs: (1, 609) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (271, 1227, 3532) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10054) [M::mem_pestat] mean and std.dev: (2312.37, 2619.74) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13315) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530312 reads in 276.905 CPU sec, 69.027 real sec [M::process] read 229068 sequences (17279002 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (45, 68009, 893, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (30, 46, 99) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 237) [M::mem_pestat] mean and std.dev: (49.13, 36.24) [M::mem_pestat] low and high boundaries for proper pairs: (1, 306) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (117, 155, 238) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 480) [M::mem_pestat] mean and std.dev: (160.54, 73.28) [M::mem_pestat] low and high boundaries for proper pairs: (1, 601) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (336, 1424, 3794) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10710) [M::mem_pestat] mean and std.dev: (2471.55, 2676.72) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14168) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530288 reads in 274.069 CPU sec, 68.501 real sec [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (27, 29865, 368, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (34, 60, 131) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 325) [M::mem_pestat] mean and std.dev: (63.30, 49.75) [M::mem_pestat] low and high boundaries for proper pairs: (1, 422) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (116, 155, 244) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 500) [M::mem_pestat] mean and std.dev: (161.89, 76.43) [M::mem_pestat] low and high boundaries for proper pairs: (1, 628) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (300, 1656, 3966) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11298) [M::mem_pestat] mean and std.dev: (2488.70, 2626.73) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14964) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 229068 reads in 121.425 CPU sec, 30.333 real sec [main] Version: 0.7.15-r1140 [main] CMD: bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S1/processings/unmapped_reads/unmapped_1.fastq.gz /dataS1/processings/unmapped_reads/unmapped_2.fastq.gz [main] Real time: 998.910 sec; CPU: 3969.081 sec CIRCexplorer2 parse -b samples/S1/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWA samples/S1/processings/circRNAs/bwa_out/S1_bwa.sam.gz | tee samplcessings/circRNAs/CIRCexplorer2_bwa/CIRCexplorer2_bwa.log CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 parse -b samples/S1/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWAS1/processings/circRNAs/bwa_out/S1_bwa.sam.gz Start CIRCexplorer2 parse at 16:44:47 Start parsing fusion junctions from BWA... Converted 74768 fusion reads! 
End CIRCexplorer2 parse at 16:45:26 zcat samples/S1/processings/circRNAs/bwa_out/S1_bwa.sam.gz | samtools view -F 4 - | cut -f 1 | sort | uniq | wc -l > samples/S1/processings/circRNAs/bwa_out/BWA_mapped_reatxt Sorting block time: 00:11:40 Returning block of 519579664 cd samples/S1/processings/circRNAs/CIRCexplorer2_bwa/annotate && CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRCrimary_assembly.fa -b /data/samples/S1/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt | tee CIRCexplorer2_bwa_annotate.log && cd CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRprimary_assembly.fa -b /data/samples/S1/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt Start CIRCexplorer2 annotate at 16:45:47 Start to annotate fusion junctions... Annotated 6763 fusion junctions! Start to fix fusion junctions... Fixed 3980 fusion junctions! End CIRCexplorer2 annotate at 16:46:11 bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S2/processings/unmapped_reads/unmapped_1.fastq.gz /data/samples/S2/gs/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S2/processings/circRNAs/bwa_out/S2_bwa.sam.gz [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 530138 sequences (40000125 bp)... [M::process] read 530144 sequences (40000023 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (28, 73766, 729, 10) [M::mem_pestat] analyzing insert size distribution for orientation FF... 
[M::mem_pestat] (25, 50, 75) percentile: (43, 76, 137) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 325) [M::mem_pestat] mean and std.dev: (85.15, 64.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 419) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (121, 168, 349) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 805) [M::mem_pestat] mean and std.dev: (185.12, 125.38) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1033) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (285, 1167, 3311) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9363) [M::mem_pestat] mean and std.dev: (2066.11, 2306.84) [M::mem_pestat] low and high boundaries for proper pairs: (1, 12389) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (27, 47, 109) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 273) [M::mem_pestat] mean and std.dev: (47.25, 33.66) [M::mem_pestat] low and high boundaries for proper pairs: (1, 355) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530138 reads in 297.819 CPU sec, 74.408 real sec [M::process] read 530134 sequences (40000035 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (47, 66062, 815, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (31, 57, 195) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 523) [M::mem_pestat] mean and std.dev: (93.93, 100.02) [M::mem_pestat] low and high boundaries for proper pairs: (1, 687) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (119, 160, 256) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 530) [M::mem_pestat] mean and std.dev: (165.84, 79.51) [M::mem_pestat] low and high boundaries for proper pairs: (1, 667) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (240, 1399, 3668) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10524) [M::mem_pestat] mean and std.dev: (2432.25, 2672.69) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13952) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF Getting block 2 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% [M::mem_process_seqs] Processed 530144 reads in 275.249 CPU sec, 68.566 real sec 60% 70% 80% 90% [M::process] read 530140 sequences (40000042 bp)... 100% Block accumulator loop time: 00:00:40 Sorting block of length 508405855 (Using difference cover) [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (42, 65311, 854, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (31, 59, 260) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 718) [M::mem_pestat] mean and std.dev: (97.37, 151.10) [M::mem_pestat] low and high boundaries for proper pairs: (1, 947) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 160, 251) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 515) [M::mem_pestat] mean and std.dev: (166.14, 78.19) [M::mem_pestat] low and high boundaries for proper pairs: (1, 647) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (261, 1363, 3508) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10002) [M::mem_pestat] mean and std.dev: (2304.88, 2550.49) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13249) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530134 reads in 277.174 CPU sec, 69.099 real sec [M::process] read 530106 sequences (40000073 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (46, 64043, 816, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (32, 76, 162) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 422) [M::mem_pestat] mean and std.dev: (89.78, 86.82) [M::mem_pestat] low and high boundaries for proper pairs: (1, 552) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 159, 249) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 509) [M::mem_pestat] mean and std.dev: (165.40, 77.65) [M::mem_pestat] low and high boundaries for proper pairs: (1, 639) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (350, 1359, 3955) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11165) [M::mem_pestat] mean and std.dev: (2495.11, 2673.13) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14770) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530140 reads in 281.336 CPU sec, 70.170 real sec [M::process] read 530160 sequences (40000064 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (43, 63517, 820, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (21, 58, 192) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 534) [M::mem_pestat] mean and std.dev: (92.11, 120.50) [M::mem_pestat] low and high boundaries for proper pairs: (1, 705) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 159, 244) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 492) [M::mem_pestat] mean and std.dev: (164.66, 75.66) [M::mem_pestat] low and high boundaries for proper pairs: (1, 616) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (230, 1230, 3683) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10589) [M::mem_pestat] mean and std.dev: (2364.37, 2713.23) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14042) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530106 reads in 277.054 CPU sec, 69.093 real sec [M::process] read 530140 sequences (40000072 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (42, 64543, 795, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (23, 37, 100) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 254) [M::mem_pestat] mean and std.dev: (56.74, 50.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 331) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (119, 159, 243) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 491) [M::mem_pestat] mean and std.dev: (164.69, 75.60) [M::mem_pestat] low and high boundaries for proper pairs: (1, 615) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (298, 1396, 4118) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11758) [M::mem_pestat] mean and std.dev: (2527.79, 2731.27) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15578) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530160 reads in 280.496 CPU sec, 69.921 real sec [M::process] read 530134 sequences (40000148 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (41, 64375, 825, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (24, 55, 196) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 540) [M::mem_pestat] mean and std.dev: (60.33, 71.76) [M::mem_pestat] low and high boundaries for proper pairs: (1, 712) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 159, 244) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 494) [M::mem_pestat] mean and std.dev: (164.62, 76.00) [M::mem_pestat] low and high boundaries for proper pairs: (1, 619) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (304, 1310, 3644) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10324) [M::mem_pestat] mean and std.dev: (2412.87, 2721.83) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13664) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530140 reads in 278.444 CPU sec, 69.410 real sec [M::process] read 530184 sequences (40000079 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (38, 63913, 847, 10) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (33, 66, 265) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 729) [M::mem_pestat] mean and std.dev: (136.60, 176.47) [M::mem_pestat] low and high boundaries for proper pairs: (1, 961) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 159, 247) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 501) [M::mem_pestat] mean and std.dev: (165.28, 76.73) [M::mem_pestat] low and high boundaries for proper pairs: (1, 628) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (345, 1257, 3897) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11001) [M::mem_pestat] mean and std.dev: (2406.67, 2637.57) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14553) [M::mem_pestat] analyzing insert size distribution for orientation RR... 
[M::mem_pestat] (25, 50, 75) percentile: (26, 53, 74) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 170) [M::mem_pestat] mean and std.dev: (43.00, 25.85) [M::mem_pestat] low and high boundaries for proper pairs: (1, 218) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530134 reads in 280.308 CPU sec, 69.873 real sec [M::process] read 530122 sequences (40000145 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (49, 63751, 793, 3) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (24, 52, 102) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 258) [M::mem_pestat] mean and std.dev: (50.88, 45.00) [M::mem_pestat] low and high boundaries for proper pairs: (1, 336) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 159, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 488) [M::mem_pestat] mean and std.dev: (164.33, 74.75) [M::mem_pestat] low and high boundaries for proper pairs: (1, 611) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (241, 1115, 3539) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10135) [M::mem_pestat] mean and std.dev: (2251.34, 2572.68) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13433) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530184 reads in 280.784 CPU sec, 69.946 real sec [M::process] read 530102 sequences (40000054 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (70, 64657, 824, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... 
[M::mem_pestat] (25, 50, 75) percentile: (33, 67, 166) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 432) [M::mem_pestat] mean and std.dev: (72.28, 72.31) [M::mem_pestat] low and high boundaries for proper pairs: (1, 565) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 158, 241) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 485) [M::mem_pestat] mean and std.dev: (163.71, 74.53) [M::mem_pestat] low and high boundaries for proper pairs: (1, 607) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (286, 1480, 4163) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11917) [M::mem_pestat] mean and std.dev: (2538.78, 2752.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15794) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530122 reads in 276.902 CPU sec, 68.968 real sec [M::process] read 530162 sequences (40000012 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (50, 63238, 869, 11) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (16, 40, 91) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 241) [M::mem_pestat] mean and std.dev: (53.93, 55.13) [M::mem_pestat] low and high boundaries for proper pairs: (1, 316) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 158, 242) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 488) [M::mem_pestat] mean and std.dev: (163.86, 74.80) [M::mem_pestat] low and high boundaries for proper pairs: (1, 611) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (439, 1588, 4019) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11179) [M::mem_pestat] mean and std.dev: (2588.16, 2692.36) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14759) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (57, 87, 133) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 285) [M::mem_pestat] mean and std.dev: (73.44, 32.18) [M::mem_pestat] low and high boundaries for proper pairs: (1, 361) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530102 reads in 279.424 CPU sec, 69.629 real sec [M::process] read 467662 sequences (35286106 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (49, 64089, 860, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (26, 62, 491) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1421) [M::mem_pestat] mean and std.dev: (187.30, 284.63) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1886) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 157, 240) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 482) [M::mem_pestat] mean and std.dev: (163.86, 74.97) [M::mem_pestat] low and high boundaries for proper pairs: (1, 603) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (368, 1357, 3720) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10424) [M::mem_pestat] mean and std.dev: (2393.50, 2615.68) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13776) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530162 reads in 277.560 CPU sec, 69.125 real sec Sorting block time: 00:11:49 Returning block of 508405856 [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (39, 56960, 725, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (14, 39, 89) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 239) [M::mem_pestat] mean and std.dev: (37.62, 30.48) [M::mem_pestat] low and high boundaries for proper pairs: (1, 314) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 157, 241) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 485) [M::mem_pestat] mean and std.dev: (163.42, 74.37) [M::mem_pestat] low and high boundaries for proper pairs: (1, 607) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (372, 1324, 3764)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10548)
[M::mem_pestat] mean and std.dev: (2383.88, 2594.16)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 13940)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 467662 reads in 245.767 CPU sec, 61.398 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S2/processings/unmapped_reads/unmapped_1.fastq.gz /dataS2/processings/unmapped_reads/unmapped_2.fastq.gz
[main] Real time: 920.031 sec; CPU: 3613.244 sec
CIRCexplorer2 parse -b samples/S2/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWA samples/S2/processings/circRNAs/bwa_out/S2_bwa.sam.gz | tee samplcessings/circRNAs/CIRCexplorer2_bwa/CIRCexplorer2_bwa.log
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 parse -b samples/S2/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWAS2/processings/circRNAs/bwa_out/S2_bwa.sam.gz
Start CIRCexplorer2 parse at 17:01:33
Start parsing fusion junctions from BWA...
Converted 62984 fusion reads!
End CIRCexplorer2 parse at 17:02:07
zcat samples/S2/processings/circRNAs/bwa_out/S2_bwa.sam.gz | samtools view -F 4 - | cut -f 1 | sort | uniq | wc -l > samples/S2/processings/circRNAs/bwa_out/BWA_mapped_reatxt
cd samples/S2/processings/circRNAs/CIRCexplorer2_bwa/annotate && CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRCrimary_assembly.fa -b /data/samples/S2/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt | tee CIRCexplorer2_bwa_annotate.log && cd
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRprimary_assembly.fa -b /data/samples/S2/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt
Start CIRCexplorer2 annotate at 17:02:25
Start to annotate fusion junctions...
Annotated 5684 fusion junctions!
Start to fix fusion junctions...
Fixed 3367 fusion junctions!
End CIRCexplorer2 annotate at 17:02:38
bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S3/processings/unmapped_reads/unmapped_1.fastq.gz /data/samples/S3/gs/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S3/processings/circRNAs/bwa_out/S3_bwa.sam.gz
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 530176 sequences (40000019 bp)...
[M::process] read 530244 sequences (40000022 bp)...
Getting block 3 of 7
Reserving size (521303816) for bucket
Calculating Z arrays
Calculating Z arrays time: 00:00:00
Entering block accumulator loop:
10%
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (44, 73736, 907, 6)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (22, 55, 119) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 313) [M::mem_pestat] mean and std.dev: (47.89, 35.55) [M::mem_pestat] low and high boundaries for proper pairs: (1, 410) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 169, 372) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 876) [M::mem_pestat] mean and std.dev: (190.33, 139.05) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1128) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (398, 1588, 4104) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11516) [M::mem_pestat] mean and std.dev: (2587.07, 2663.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15222) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:41 Sorting block of length 208215182 (Using difference cover) [M::mem_process_seqs] Processed 530176 reads in 311.786 CPU sec, 77.873 real sec [M::process] read 530226 sequences (40000020 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (46, 64689, 969, 11) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (27, 54, 102) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 252) [M::mem_pestat] mean and std.dev: (56.02, 47.90) [M::mem_pestat] low and high boundaries for proper pairs: (1, 327) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (119, 161, 267) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 563) [M::mem_pestat] mean and std.dev: (167.44, 83.96) [M::mem_pestat] low and high boundaries for proper pairs: (1, 711) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (296, 1335, 4163) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11897) [M::mem_pestat] mean and std.dev: (2524.79, 2757.80) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15764) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (30, 42, 588) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1704) [M::mem_pestat] mean and std.dev: (179.70, 302.72) [M::mem_pestat] low and high boundaries for proper pairs: (1, 2262) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530244 reads in 289.791 CPU sec, 72.321 real sec [M::process] read 530230 sequences (40000098 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (62, 64121, 967, 13) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (21, 47, 118) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 312) [M::mem_pestat] mean and std.dev: (48.98, 46.55) [M::mem_pestat] low and high boundaries for proper pairs: (1, 409) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 161, 259) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 539) [M::mem_pestat] mean and std.dev: (165.81, 79.91) [M::mem_pestat] low and high boundaries for proper pairs: (1, 679) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (400, 1563, 3843) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10729) [M::mem_pestat] mean and std.dev: (2555.46, 2687.74) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14172) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (18, 44, 527) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1545) [M::mem_pestat] mean and std.dev: (128.00, 215.67) [M::mem_pestat] low and high boundaries for proper pairs: (1, 2054) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 530226 reads in 288.374 CPU sec, 71.861 real sec [M::process] read 530222 sequences (40000134 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (42, 62728, 1035, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (14, 72, 364) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1064) [M::mem_pestat] mean and std.dev: (124.24, 159.16) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1414) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 159, 252) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 520) [M::mem_pestat] mean and std.dev: (164.69, 78.33) [M::mem_pestat] low and high boundaries for proper pairs: (1, 654) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (401, 1606, 4104) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11510) [M::mem_pestat] mean and std.dev: (2649.34, 2726.99) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15213) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530230 reads in 290.421 CPU sec, 72.434 real sec [M::process] read 530218 sequences (40000005 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (40, 62679, 1026, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (30, 68, 139) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 357) [M::mem_pestat] mean and std.dev: (53.90, 37.57) [M::mem_pestat] low and high boundaries for proper pairs: (1, 466) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 160, 252) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 518) [M::mem_pestat] mean and std.dev: (165.38, 78.12) [M::mem_pestat] low and high boundaries for proper pairs: (1, 651) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (408, 1608, 4066) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11382) [M::mem_pestat] mean and std.dev: (2542.61, 2639.44) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15040) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530222 reads in 288.491 CPU sec, 71.980 real sec Sorting block time: 00:04:59 Returning block of 208215183 [M::process] read 530180 sequences (40000092 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (51, 62745, 1027, 10) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (20, 43, 126) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 338) [M::mem_pestat] mean and std.dev: (56.16, 49.03) [M::mem_pestat] low and high boundaries for proper pairs: (1, 444) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 159, 248) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 508) [M::mem_pestat] mean and std.dev: (164.11, 76.43) [M::mem_pestat] low and high boundaries for proper pairs: (1, 638) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (437, 1786, 4075) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11351) [M::mem_pestat] mean and std.dev: (2590.58, 2567.42) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14989) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (29, 109, 724) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2114) [M::mem_pestat] mean and std.dev: (282.00, 372.61) [M::mem_pestat] low and high boundaries for proper pairs: (1, 2809) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR Getting block 4 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: [M::mem_process_seqs] Processed 530218 reads in 287.641 CPU sec, 71.691 real sec 10% 20% 30% [M::process] read 530204 sequences (40000022 bp)... 40% 50% 60% 70% 80% 90% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (56, 63186, 1028, 3) [M::mem_pestat] analyzing insert size distribution for orientation FF... 
[M::mem_pestat] (25, 50, 75) percentile: (19, 48, 93) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 241) [M::mem_pestat] mean and std.dev: (46.54, 36.56) [M::mem_pestat] low and high boundaries for proper pairs: (1, 315) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 160, 251) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 515) [M::mem_pestat] mean and std.dev: (164.97, 77.49) [M::mem_pestat] low and high boundaries for proper pairs: (1, 647) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (385, 1598, 4185) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11785) [M::mem_pestat] mean and std.dev: (2575.94, 2644.94) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15585) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 100% Block accumulator loop time: 00:00:49 Sorting block of length 342801845 (Using difference cover) [M::mem_process_seqs] Processed 530180 reads in 282.186 CPU sec, 70.290 real sec [M::process] read 530282 sequences (40000064 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (55, 62748, 1025, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (31, 53, 103) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 247) [M::mem_pestat] mean and std.dev: (54.43, 38.76) [M::mem_pestat] low and high boundaries for proper pairs: (1, 319) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (119, 159, 249) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 509) [M::mem_pestat] mean and std.dev: (164.68, 77.06) [M::mem_pestat] low and high boundaries for proper pairs: (1, 639) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (399, 1558, 3897) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10893) [M::mem_pestat] mean and std.dev: (2494.74, 2573.98) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14391) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530204 reads in 288.098 CPU sec, 71.774 real sec [M::process] read 530232 sequences (40000048 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (72, 62263, 1065, 8) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (34, 60, 142) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 358) [M::mem_pestat] mean and std.dev: (67.31, 57.32) [M::mem_pestat] low and high boundaries for proper pairs: (1, 466) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 158, 248) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 508) [M::mem_pestat] mean and std.dev: (163.95, 76.79) [M::mem_pestat] low and high boundaries for proper pairs: (1, 638) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (456, 1777, 4210) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11718) [M::mem_pestat] mean and std.dev: (2682.92, 2660.42) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15472) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530282 reads in 290.984 CPU sec, 72.496 real sec [M::process] read 530202 sequences (40000039 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (52, 63319, 1051, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (25, 52, 148) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 394) [M::mem_pestat] mean and std.dev: (59.53, 59.51) [M::mem_pestat] low and high boundaries for proper pairs: (1, 517) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 159, 250) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 512) [M::mem_pestat] mean and std.dev: (164.93, 77.24) [M::mem_pestat] low and high boundaries for proper pairs: (1, 643) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (356, 1546, 4054) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11450) [M::mem_pestat] mean and std.dev: (2557.01, 2655.67) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15148) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530232 reads in 287.725 CPU sec, 71.821 real sec [M::process] read 530254 sequences (40000081 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (55, 62154, 1044, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (17, 42, 166) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 464) [M::mem_pestat] mean and std.dev: (68.85, 91.70) [M::mem_pestat] low and high boundaries for proper pairs: (1, 613) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 158, 246) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 502) [M::mem_pestat] mean and std.dev: (163.57, 76.08) [M::mem_pestat] low and high boundaries for proper pairs: (1, 630) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (378, 1643, 4104) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11556) [M::mem_pestat] mean and std.dev: (2611.94, 2703.98) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15282) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530202 reads in 289.856 CPU sec, 72.253 real sec [M::process] read 372372 sequences (28088445 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (61, 62493, 1031, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (28, 56, 290) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 814) [M::mem_pestat] mean and std.dev: (93.50, 142.13) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1076) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (119, 159, 247) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 503) [M::mem_pestat] mean and std.dev: (163.93, 75.81) [M::mem_pestat] low and high boundaries for proper pairs: (1, 631) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (398, 1572, 3849) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10751) [M::mem_pestat] mean and std.dev: (2554.21, 2640.22) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14202) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530254 reads in 289.500 CPU sec, 72.227 real sec [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (30, 44682, 703, 4) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (28, 82, 395) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1129) [M::mem_pestat] mean and std.dev: (147.96, 221.32) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1496) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (118, 158, 250) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 514) [M::mem_pestat] mean and std.dev: (164.80, 77.81) [M::mem_pestat] low and high boundaries for proper pairs: (1, 646) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (424, 1725, 4193)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11731)
[M::mem_pestat] mean and std.dev: (2625.17, 2627.73)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 15500)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 372372 reads in 203.326 CPU sec, 50.832 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S3/processings/unmapped_reads/unmapped_1.fastq.gz /dataS3/processings/unmapped_reads/unmapped_2.fastq.gz
[main] Real time: 932.427 sec; CPU: 3692.121 sec
CIRCexplorer2 parse -b samples/S3/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWA samples/S3/processings/circRNAs/bwa_out/S3_bwa.sam.gz | tee samplcessings/circRNAs/CIRCexplorer2_bwa/CIRCexplorer2_bwa.log
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 parse -b samples/S3/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWAS3/processings/circRNAs/bwa_out/S3_bwa.sam.gz
Start CIRCexplorer2 parse at 17:18:12
Start parsing fusion junctions from BWA...
Converted 68255 fusion reads!
End CIRCexplorer2 parse at 17:18:46
zcat samples/S3/processings/circRNAs/bwa_out/S3_bwa.sam.gz | samtools view -F 4 - | cut -f 1 | sort | uniq | wc -l > samples/S3/processings/circRNAs/bwa_out/BWA_mapped_reatxt
Sorting block time: 00:08:04
Returning block of 342801846
cd samples/S3/processings/circRNAs/CIRCexplorer2_bwa/annotate && CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRCrimary_assembly.fa -b /data/samples/S3/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt | tee CIRCexplorer2_bwa_annotate.log && cd
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRprimary_assembly.fa -b /data/samples/S3/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt
Start CIRCexplorer2 annotate at 17:19:04
Start to annotate fusion junctions...
Annotated 6673 fusion junctions!
Start to fix fusion junctions...
Fixed 3941 fusion junctions!
End CIRCexplorer2 annotate at 17:19:17
bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S4/processings/unmapped_reads/unmapped_1.fastq.gz /data/samples/S4/gs/unmapped_reads/unmapped_2.fastq.gz | gzip -c > samples/S4/processings/circRNAs/bwa_out/S4_bwa.sam.gz
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 530010 sequences (40000082 bp)...
[M::process] read 529984 sequences (40000009 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (28, 53342, 642, 11)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (23, 84, 904) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2666) [M::mem_pestat] mean and std.dev: (236.54, 325.99) [M::mem_pestat] low and high boundaries for proper pairs: (1, 3547) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (121, 175, 473) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1177) [M::mem_pestat] mean and std.dev: (218.11, 199.59) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1529) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (472, 1500, 4111) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11389) [M::mem_pestat] mean and std.dev: (2613.32, 2685.49) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15028) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (51, 62, 95) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 183) [M::mem_pestat] mean and std.dev: (53.67, 28.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 227) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR Getting block 5 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% [M::mem_process_seqs] Processed 530010 reads in 412.471 CPU sec, 103.062 real sec 70% 80% 90% [M::process] read 529990 sequences (40000019 bp)... 100% Block accumulator loop time: 00:00:50 Sorting block of length 485518143 (Using difference cover) [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (29, 46679, 717, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... 
[M::mem_pestat] (25, 50, 75) percentile: (26, 49, 851) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2501) [M::mem_pestat] mean and std.dev: (240.23, 384.23) [M::mem_pestat] low and high boundaries for proper pairs: (1, 3326) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 165, 287) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 621) [M::mem_pestat] mean and std.dev: (172.53, 91.79) [M::mem_pestat] low and high boundaries for proper pairs: (1, 788) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (404, 1594, 3818) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10646) [M::mem_pestat] mean and std.dev: (2532.44, 2596.86) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14060) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529984 reads in 332.028 CPU sec, 82.777 real sec [M::process] read 530006 sequences (40000099 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (24, 45167, 685, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (40, 64, 170) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 430) [M::mem_pestat] mean and std.dev: (69.85, 65.00) [M::mem_pestat] low and high boundaries for proper pairs: (1, 560) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (120, 163, 271) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 573) [M::mem_pestat] mean and std.dev: (169.55, 85.75) [M::mem_pestat] low and high boundaries for proper pairs: (1, 724) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (396, 1655, 4085) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11463) [M::mem_pestat] mean and std.dev: (2516.58, 2541.91) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15152) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529990 reads in 330.544 CPU sec, 82.435 real sec [M::process] read 530022 sequences (40000026 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (30, 45193, 667, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (15, 56, 114) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 312) [M::mem_pestat] mean and std.dev: (51.64, 46.88) [M::mem_pestat] low and high boundaries for proper pairs: (1, 411) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 163, 266) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 558) [M::mem_pestat] mean and std.dev: (169.09, 83.71) [M::mem_pestat] low and high boundaries for proper pairs: (1, 704) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (401, 1645, 4380) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 12338) [M::mem_pestat] mean and std.dev: (2713.92, 2762.50) [M::mem_pestat] low and high boundaries for proper pairs: (1, 16317) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530006 reads in 329.639 CPU sec, 82.221 real sec [M::process] read 529982 sequences (40000101 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (33, 44255, 721, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (20, 48, 235) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 665) [M::mem_pestat] mean and std.dev: (78.07, 105.57) [M::mem_pestat] low and high boundaries for proper pairs: (1, 880) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 162, 262) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 548) [M::mem_pestat] mean and std.dev: (168.60, 83.13) [M::mem_pestat] low and high boundaries for proper pairs: (1, 691) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (407, 1447, 3771) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10499) [M::mem_pestat] mean and std.dev: (2494.61, 2632.05) [M::mem_pestat] low and high boundaries for proper pairs: (1, 13863) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530022 reads in 328.609 CPU sec, 82.001 real sec [M::process] read 529986 sequences (40000133 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (35, 43149, 740, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (23, 46, 101) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 257) [M::mem_pestat] mean and std.dev: (54.17, 46.18) [M::mem_pestat] low and high boundaries for proper pairs: (1, 335) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (121, 162, 261) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 541) [M::mem_pestat] mean and std.dev: (168.87, 81.87) [M::mem_pestat] low and high boundaries for proper pairs: (1, 681) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (404, 1577, 4111) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11525) [M::mem_pestat] mean and std.dev: (2669.43, 2756.29) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15232) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529982 reads in 327.736 CPU sec, 81.768 real sec [M::process] read 530020 sequences (40000052 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (29, 44376, 696, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (27, 42, 108) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 270) [M::mem_pestat] mean and std.dev: (59.96, 47.05) [M::mem_pestat] low and high boundaries for proper pairs: (1, 351) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (120, 161, 257) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 531) [M::mem_pestat] mean and std.dev: (167.65, 80.42) [M::mem_pestat] low and high boundaries for proper pairs: (1, 668) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (476, 1659, 4018) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11102) [M::mem_pestat] mean and std.dev: (2622.39, 2665.85) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14644) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529986 reads in 326.433 CPU sec, 81.389 real sec [M::process] read 530014 sequences (40000059 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (37, 43855, 724, 4) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (16, 36, 183) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 517) [M::mem_pestat] mean and std.dev: (102.26, 145.30) [M::mem_pestat] low and high boundaries for proper pairs: (1, 684) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 161, 258) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 534) [M::mem_pestat] mean and std.dev: (167.77, 80.58) [M::mem_pestat] low and high boundaries for proper pairs: (1, 672) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (370, 1444, 4023) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11329) [M::mem_pestat] mean and std.dev: (2597.54, 2820.81) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14982) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530020 reads in 325.896 CPU sec, 81.271 real sec [M::process] read 529986 sequences (40000052 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (19, 44153, 694, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (45, 64, 125) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 285) [M::mem_pestat] mean and std.dev: (54.00, 27.99) [M::mem_pestat] low and high boundaries for proper pairs: (1, 365) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 161, 260) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 540) [M::mem_pestat] mean and std.dev: (168.04, 81.60) [M::mem_pestat] low and high boundaries for proper pairs: (1, 680) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (337, 1502, 3909) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11053) [M::mem_pestat] mean and std.dev: (2461.57, 2604.61) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14625) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530014 reads in 329.731 CPU sec, 82.232 real sec [M::process] read 529996 sequences (40000132 bp)... 
Sorting block time: 00:11:22 Returning block of 485518144 [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (33, 43947, 699, 9) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (34, 52, 210) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 562) [M::mem_pestat] mean and std.dev: (74.18, 89.20) [M::mem_pestat] low and high boundaries for proper pairs: (1, 738) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 162, 256) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 528) [M::mem_pestat] mean and std.dev: (167.91, 79.93) [M::mem_pestat] low and high boundaries for proper pairs: (1, 664) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (513, 1603, 4089) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11241) [M::mem_pestat] mean and std.dev: (2629.69, 2669.56) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14817) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529986 reads in 325.503 CPU sec, 81.205 real sec [M::process] read 530024 sequences (40000073 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (29, 43539, 688, 10) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (23, 56, 113) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 293) [M::mem_pestat] mean and std.dev: (66.33, 57.46) [M::mem_pestat] low and high boundaries for proper pairs: (1, 383) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (120, 162, 260) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 540) [M::mem_pestat] mean and std.dev: (168.54, 81.75) [M::mem_pestat] low and high boundaries for proper pairs: (1, 680) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (437, 1649, 4042) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11252) [M::mem_pestat] mean and std.dev: (2564.61, 2652.32) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14857) [M::mem_pestat] analyzing insert size distribution for orientation RR... [M::mem_pestat] (25, 50, 75) percentile: (44, 106, 158) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 386) [M::mem_pestat] mean and std.dev: (85.33, 53.16) [M::mem_pestat] low and high boundaries for proper pairs: (1, 500) [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_pestat] skip orientation RR [M::mem_process_seqs] Processed 529996 reads in 327.780 CPU sec, 81.764 real sec [M::process] read 529998 sequences (40000055 bp)... Getting block 6 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (26, 43630, 760, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (30, 70, 178) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 474) [M::mem_pestat] mean and std.dev: (70.67, 73.56) [M::mem_pestat] low and high boundaries for proper pairs: (1, 622) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (121, 162, 259) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 535) [M::mem_pestat] mean and std.dev: (168.83, 80.92) [M::mem_pestat] low and high boundaries for proper pairs: (1, 673) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (418, 1514, 4004) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11176) [M::mem_pestat] mean and std.dev: (2543.38, 2705.81) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14762) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:43 Sorting block of length 406785588 (Using difference cover) [M::mem_process_seqs] Processed 530024 reads in 322.435 CPU sec, 80.437 real sec [M::process] read 529978 sequences (40000045 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (26, 44405, 734, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (21, 56, 287) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 819) [M::mem_pestat] mean and std.dev: (93.68, 142.71) [M::mem_pestat] low and high boundaries for proper pairs: (1, 1085) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 161, 256) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 528) [M::mem_pestat] mean and std.dev: (167.78, 80.39) [M::mem_pestat] low and high boundaries for proper pairs: (1, 664) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (435, 1897, 4319) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 12087) [M::mem_pestat] mean and std.dev: (2851.03, 2841.13) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15971) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529998 reads in 323.915 CPU sec, 80.792 real sec [M::process] read 530030 sequences (40000142 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (39, 43182, 744, 5) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (14, 39, 100) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 272) [M::mem_pestat] mean and std.dev: (50.51, 49.47) [M::mem_pestat] low and high boundaries for proper pairs: (1, 358) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 162, 255) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 525) [M::mem_pestat] mean and std.dev: (166.79, 78.78) [M::mem_pestat] low and high boundaries for proper pairs: (1, 660) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (406, 1627, 4199) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11785) [M::mem_pestat] mean and std.dev: (2609.82, 2690.53) [M::mem_pestat] low and high boundaries for proper pairs: (1, 15578) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 529978 reads in 327.709 CPU sec, 81.738 real sec [M::process] read 530020 sequences (40000107 bp)... 
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (27, 43929, 698, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (21, 38, 76) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 186) [M::mem_pestat] mean and std.dev: (43.33, 34.25) [M::mem_pestat] low and high boundaries for proper pairs: (1, 241) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (120, 160, 253) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 519) [M::mem_pestat] mean and std.dev: (166.92, 79.30) [M::mem_pestat] low and high boundaries for proper pairs: (1, 652) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (470, 1610, 4479) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 12497) [M::mem_pestat] mean and std.dev: (2714.91, 2746.82) [M::mem_pestat] low and high boundaries for proper pairs: (1, 16506) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530030 reads in 326.565 CPU sec, 81.421 real sec [M::process] read 371152 sequences (28008936 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (45, 43786, 738, 7) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (24, 45, 185) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 507) [M::mem_pestat] mean and std.dev: (67.14, 67.09) [M::mem_pestat] low and high boundaries for proper pairs: (1, 668) [M::mem_pestat] analyzing insert size distribution for orientation FR... 
[M::mem_pestat] (25, 50, 75) percentile: (120, 162, 257) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 531) [M::mem_pestat] mean and std.dev: (168.15, 80.20) [M::mem_pestat] low and high boundaries for proper pairs: (1, 668) [M::mem_pestat] analyzing insert size distribution for orientation RF... [M::mem_pestat] (25, 50, 75) percentile: (336, 1416, 3872) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 10944) [M::mem_pestat] mean and std.dev: (2484.15, 2659.84) [M::mem_pestat] low and high boundaries for proper pairs: (1, 14480) [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_pestat] skip orientation FF [M::mem_pestat] skip orientation RF [M::mem_process_seqs] Processed 530020 reads in 328.330 CPU sec, 81.997 real sec [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (26, 31494, 458, 6) [M::mem_pestat] analyzing insert size distribution for orientation FF... [M::mem_pestat] (25, 50, 75) percentile: (35, 89, 593) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1709) [M::mem_pestat] mean and std.dev: (262.29, 329.58) [M::mem_pestat] low and high boundaries for proper pairs: (1, 2267) [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (119, 160, 256) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 530) [M::mem_pestat] mean and std.dev: (166.83, 79.82) [M::mem_pestat] low and high boundaries for proper pairs: (1, 667) [M::mem_pestat] analyzing insert size distribution for orientation RF... 
[M::mem_pestat] (25, 50, 75) percentile: (440, 1494, 4039)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 11237)
[M::mem_pestat] mean and std.dev: (2480.33, 2526.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 14836)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 371152 reads in 229.205 CPU sec, 57.240 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 4 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S4/processings/unmapped_reads/unmapped_1.fastq.gz /dataS4/processings/unmapped_reads/unmapped_2.fastq.gz
[main] Real time: 1398.167 sec; CPU: 5558.592 sec
CIRCexplorer2 parse -b samples/S4/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWA samples/S4/processings/circRNAs/bwa_out/S4_bwa.sam.gz | tee samplcessings/circRNAs/CIRCexplorer2_bwa/CIRCexplorer2_bwa.log
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 parse -b samples/S4/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -t BWAS4/processings/circRNAs/bwa_out/S4_bwa.sam.gz
Start CIRCexplorer2 parse at 17:42:37
Start parsing fusion junctions from BWA...
Converted 69929 fusion reads!
End CIRCexplorer2 parse at 17:43:23
zcat samples/S4/processings/circRNAs/bwa_out/S4_bwa.sam.gz | samtools view -F 4 - | cut -f 1 | sort | uniq | wc -l > samples/S4/processings/circRNAs/bwa_out/BWA_mapped_reatxt
cd samples/S4/processings/circRNAs/CIRCexplorer2_bwa/annotate && CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRCrimary_assembly.fa -b /data/samples/S4/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt | tee CIRCexplorer2_bwa_annotate.log && cd
CIRCexplorer parameters: /circompara2/src/utils/bash/../../../bin/CIRCexplorer2 annotate -r /data/dbs/indexes/Homo_sapiens.GRCh38.104.genePred.wgn -g /data/Homo_sapiens.GRprimary_assembly.fa -b /data/samples/S4/processings/circRNAs/CIRCexplorer2_bwa/back_spliced_junction.bed -o circularRNA_known.txt
Start CIRCexplorer2 annotate at 17:43:49
Start to annotate fusion junctions...
Annotated 6227 fusion junctions!
Start to fix fusion junctions...
Fixed 3712 fusion junctions!
End CIRCexplorer2 annotate at 17:44:01 CIRCexplorer_compare.R -l S1,S2,S3,S4 -i samples/S1/processings/circRNAs/CIRCexplorer2_bwa/annotate/circularRNA_known.txt,samples/S2/processings/circRNAs/CIRCexplorer2_bwa/circularRNA_known.txt,samples/S3/processings/circRNAs/CIRCexplorer2_bwa/annotate/circularRNA_known.txt,samples/S4/processings/circRNAs/CIRCexplorer2_bwa/annotate/circular.txt -o circular_expression/circrna_collection/merged_samples_circrnas/CIRCexplorer2_bwa_compared.csv cd /data/samples/S1/processings/circRNAs/ciri_out && zcat /data/samples/S1/processings/circRNAs/bwa_out/S1_bwa.sam.gz > S1_bwa.sam.temp && perl /circompara2/src/utils/bash/bin/CIRI.pl -T 4 -I S1_bwa.sam.temp -O S1_ciri.out -F /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa -A /data/Homo_sapiens.GRCh38.104.gtf && rm S1_bwa.sam.temp && cd / [Thu Sep 15 17:44:46 2022] CIRI begins running [Thu Sep 15 17:44:46 2022] Loading reference Sorting block time: 00:09:17 Returning block of 406785589 [Thu Sep 15 17:45:07 2022] Requesting system to split SAM into 4 pieces Getting block 7 of 7 Reserving size (521303816) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:31 Sorting block of length 308980734 (Using difference cover) Divided SAM sizes: ./S1_bwa.sam.tempaa 644848370 ./S1_bwa.sam.tempab 644848370 ./S1_bwa.sam.tempac 644848370 ./S1_bwa.sam.tempad 644848368 SAM was divided successfully. First read of divided SAM files: S1_bwa.sam.tempab: SRR7120609.11930353 S1_bwa.sam.tempac: SRR7120610.11877086 S1_bwa.sam.tempad: SRR7120611.12300507 S1_bwa.sam.tempaa: SRR7120609.1 First reads were recorded successfully. [Thu Sep 15 17:48:16 2022] First scanning Worker 1 begins to scan S1_bwa.sam.tempaa. Worker 2 begins to scan S1_bwa.sam.tempac. Worker 3 begins to scan S1_bwa.sam.tempab. Worker 4 begins to scan S1_bwa.sam.tempad. Worker 1 finished reporting. Worker 2 finished reporting. 
Worker 3 finished reporting. Worker 4 finished reporting. Candidate reads with splicing signals: 11295 Candidate reads with PEM signals: 11020 Candidate circRNAs found: 7065 [Thu Sep 15 17:49:42 2022] Second scanning Worker 5 begins to scan S1_bwa.sam.tempaa. Worker 6 begins to scan S1_bwa.sam.tempac. Worker 7 begins to scan S1_bwa.sam.tempab. Worker 8 begins to scan S1_bwa.sam.tempad. Worker 5 finished reporting. Worker 6 finished reporting. Worker 7 finished reporting. Worker 8 finished reporting. [Thu Sep 15 17:50:53 2022] Extracting info from temporary files Additional candidate reads found: 1787 Additional candidate reads with PEM signals: 1717 [Thu Sep 15 17:50:56 2022] Summarizing Number of circular RNAs found: 1715 [Thu Sep 15 17:51:00 2022] CIRI finished its work. Please see output file S1_ciri.out for detail. cd /data/samples/S2/processings/circRNAs/ciri_out && zcat /data/samples/S2/processings/circRNAs/bwa_out/S2_bwa.sam.gz > S2_bwa.sam.temp && perl /circompara2/src/utils/bash/bin/CIRI.pl -T 4 -I S2_bwa.sam.temp -O S2_ciri.out -F /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa -A /data/Homo_sapiens.GRCh38.104.gtf && rm S2_bwa.sam.temp && cd / [Thu Sep 15 17:52:17 2022] CIRI begins running [Thu Sep 15 17:52:17 2022] Loading reference [Thu Sep 15 17:53:43 2022] Requesting system to split SAM into 4 pieces Divided SAM sizes: ./S2_bwa.sam.tempaa 575947483 ./S2_bwa.sam.tempab 575947483 ./S2_bwa.sam.tempac 575947483 ./S2_bwa.sam.tempad 575947482 SAM was divided successfully. First read of divided SAM files: S2_bwa.sam.tempab: SRR7120613.9923190 S2_bwa.sam.tempac: SRR7120614.9839058 S2_bwa.sam.tempad: SRR7120615.10221752 S2_bwa.sam.tempaa: SRR7120613.1 First reads were recorded successfully. [Thu Sep 15 17:55:10 2022] First scanning Worker 1 begins to scan S2_bwa.sam.tempad. Worker 2 begins to scan S2_bwa.sam.tempaa. Worker 3 begins to scan S2_bwa.sam.tempac. Worker 4 begins to scan S2_bwa.sam.tempab. Worker 1 finished reporting. 
Sorting block time: 00:08:05 Returning block of 308980735 Worker 2 finished reporting. Worker 3 finished reporting. Worker 4 finished reporting. Candidate reads with splicing signals: 8954 Candidate reads with PEM signals: 8723 Candidate circRNAs found: 5795 [Thu Sep 15 17:56:06 2022] Second scanning Worker 5 begins to scan S2_bwa.sam.tempad. Worker 6 begins to scan S2_bwa.sam.tempaa. Worker 7 begins to scan S2_bwa.sam.tempac. Worker 8 begins to scan S2_bwa.sam.tempab. Worker 5 finished reporting. Worker 6 finished reporting. Worker 7 finished reporting. Worker 8 finished reporting. [Thu Sep 15 17:57:09 2022] Extracting info from temporary files Additional candidate reads found: 1374 Additional candidate reads with PEM signals: 1292 [Thu Sep 15 17:57:12 2022] Summarizing Number of circular RNAs found: 1335 [Thu Sep 15 17:57:15 2022] CIRI finished its work. Please see output file S2_ciri.out for detail. cd /data/samples/S3/processings/circRNAs/ciri_out && zcat /data/samples/S3/processings/circRNAs/bwa_out/S3_bwa.sam.gz > S3_bwa.sam.temp && perl /circompara2/src/utils/bash/bin/CIRI.pl -T 4 -I S3_bwa.sam.temp -O S3_ciri.out -F /data/Homo_sapiens.GRCh38.dna.primary_assembly.fa -A /data/Homo_sapiens.GRCh38.104.gtf && rm S3_bwa.sam.temp && cd / Exited Ebwt loop fchr[A]: 0 fchr[C]: 819570787 fchr[G]: 1387714628 fchr[T]: 1958169038 fchr[$]: 2780287016 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 798574390 bytes to primary EBWT file: dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly.rev.1.ebwt Wrote 347535884 bytes to secondary EBWT file: dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly.rev.2.ebwt Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 2780287016 bwtLen: 2780287017 sz: 695071754 bwtSz: 695071755 lineRate: 6 linesPerSide: 1 offRate: 5 offMask: 0xffffffe0 isaRate: -1 isaMask: 0xffffffff ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 
86883970 offsSz: 347535880 isaLen: 0 isaSz: 0 lineSz: 64 sideSz: 64 sideBwtSz: 56 sideBwtLen: 224 numSidePairs: 6205998 numSides: 12411996 numLines: 12411996 ebwtTotLen: 794367744 ebwtTotSz: 794367744 reverse: 0 Total time for backward call to driver() for mirror index: 01:33:18 [Thu Sep 15 17:57:42 2022] CIRI begins running [Thu Sep 15 17:57:42 2022] Loading reference tophat2 -o samples/S3/processings/circRNAs/tophat_out --fusion-search --keep-fasta-order --no-coverage-search --bowtie1 --GTF /data/Homo_sapiens.GRCh38.104.gtf -p 4 /data/es/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly /data/samples/S3/processings/circRNAs/S3.unmappedSE.fq.gz

[2022-09-15 17:59:06] Beginning TopHat run (v2.1.0)

[2022-09-15 17:59:06] Checking for Bowtie Bowtie version: 1.1.2.0 [2022-09-15 17:59:07] Checking for Bowtie index files (genome).. [2022-09-15 17:59:07] Checking for reference FASTA file [2022-09-15 17:59:07] Generating SAM header for /data/dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly [2022-09-15 17:59:09] Reading known junctions from GTF file [2022-09-15 17:59:30] Preparing reads [Thu Sep 15 17:59:40 2022] Requesting system to split SAM into 4 pieces Divided SAM sizes: ./S3_bwa.sam.tempaa 570069068 ./S3_bwa.sam.tempab 570069068 ./S3_bwa.sam.tempac 570069068 ./S3_bwa.sam.tempad 570069065 SAM was divided successfully. First read of divided SAM files: S3_bwa.sam.tempab: SRR7120617.10185748 S3_bwa.sam.tempac: SRR7120618.10136952 S3_bwa.sam.tempad: SRR7120619.10506075 S3_bwa.sam.tempaa: SRR7120617.2 First reads were recorded successfully. [Thu Sep 15 18:01:17 2022] First scanning Worker 1 begins to scan S3_bwa.sam.tempaa. Worker 2 begins to scan S3_bwa.sam.tempad. Worker 3 begins to scan S3_bwa.sam.tempac. Worker 4 begins to scan S3_bwa.sam.tempab. Worker 1 finished reporting. Worker 2 finished reporting. left reads: min. length=35, max. length=76, 6564177 kept reads (170865 discarded) [2022-09-15 18:02:03] Building transcriptome data files samples/S3/processings/circRNAs/tophat_out/tmp/Homo_sapiens.GRCh38.104 Worker 3 finished reporting. Worker 4 finished reporting. Candidate reads with splicing signals: 12418 Candidate reads with PEM signals: 12121 Candidate circRNAs found: 6914 [Thu Sep 15 18:02:13 2022] Second scanning Worker 5 begins to scan S3_bwa.sam.tempaa. Worker 6 begins to scan S3_bwa.sam.tempad. Worker 7 begins to scan S3_bwa.sam.tempac. Worker 8 begins to scan S3_bwa.sam.tempab. [FAILED] Error: gtf_to_fasta returned an error. scons: *** [samples/S3/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1 Worker 5 finished reporting. Worker 6 finished reporting. Worker 7 finished reporting. Worker 8 finished reporting. 
[Thu Sep 15 18:03:30 2022] Extracting info from temporary files Additional candidate reads found: 2268 Additional candidate reads with PEM signals: 2186 [Thu Sep 15 18:03:32 2022] Summarizing Number of circular RNAs found: 1873 [Thu Sep 15 18:03:35 2022] CIRI finished its work. Please see output file S3_ciri.out for detail. scons: building terminated because of errors.

egaffo commented 1 year ago

It looks like an issue with TopHat2 failing on that annotation file. Could you try using another annotation file? I ran it successfully with Homo_sapiens.GRCh38.97.gtf.

You could also check whether the TopHat2 logs give any helpful message: look at g2f.err and g2f.log in the directory samples/S3/processings/circRNAs/tophat_out/logs

pecoraro90 commented 1 year ago

Hi Enrico, I tried with your GTF but I still got the same error. I'm pasting the error, g2f.err, and g2f.log below.

[2022-09-16 16:15:29] Beginning TopHat run (v2.1.0)

[2022-09-16 16:15:29] Checking for Bowtie Bowtie version: 1.1.2.0 [2022-09-16 16:15:29] Checking for Bowtie index files (genome).. [2022-09-16 16:15:29] Checking for reference FASTA file [2022-09-16 16:15:29] Generating SAM header for /data/dbs/indexes/indexes/bowtie/Homo_sapiens.GRCh38.dna.primary_assembly [2022-09-16 16:15:36] Reading known junctions from GTF file [2022-09-16 16:15:57] Preparing reads left reads: min. length=35, max. length=76, 8410919 kept reads (440 279 discarded) [2022-09-16 16:16:37] Building transcriptome data files samples/S4/processings/circRNAs/tophat_out/tmp/Homo_sapiens.GRCh38.97 [FAILED] Error: gtf_to_fasta returned an error. scons: [samples/S4/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1 left reads: min. length=35, max. length=76, 6564177 kept reads (170865 discarded) [2022-09-16 16:17:54] Building transcriptome data files samples/S3/processings/circRNAs/tophat_out/tmp/Homo_sapiens.GRCh38.97 [FAILED] Error: gtf_to_fasta returned an error. scons: [samples/S3/processings/circRNAs/tophat_out/accepted_hits.bam] Error 1 scons: building terminated because of errors.

less g2f.err terminate called after throwing an instance of 'std::out_of_range' what(): basic_string::substr

less g2f.out Reading the annotation file: /data/Homo_sapiens.GRCh38.97.gtf

egaffo commented 1 year ago

I can't think of anything other than 1) a corrupted GTF file, 2) a corrupted TopHat2 index (actually the Bowtie1 index), 3) running out of disk space, or 4) running out of RAM during the run. Did the index generation complete successfully? For Bowtie1/TopHat2, bowtie-inspect can help. Were any other processes running concurrently on your machine (perhaps from other users) that could have consumed its resources? I guess it is not related to the input reads, as it seems to fail while processing the annotation file to build the transcriptome. Can you try with a minimal GTF, e.g. by extracting the rows of only one gene with grep 'gene_name "HIPK3"' Homo_sapiens.GRCh38.97.gtf > hipk3.gtf (you should obtain 162 rows)? Did CirComPara2 run smoothly with its test files?

pecoraro90 commented 1 year ago

Hi Enrico, I just wanted to let you know that it finally worked. The problem was that the genome.fa file contained a scaffold not included in the GTF file. I ran the pipeline with CIRCRNA_METHODS = 'circexplorer2_bwa,circexplorer2_tophat,ciri,findcirc' because of memory issues. Thank you for your help! I didn't find the gene_expression_analysis.html summary report, but all the other output files seem to be there. I also wanted to ask your opinion about a possible downstream differential expression analysis. Do you think it would be feasible to take the reliable_circexp.csv expression matrix and perform DE in DESeq2, or would that kind of data not be correctly normalized for library size/sequencing depth? Which approach would you suggest?
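For anyone hitting the same gtf_to_fasta failure, the mismatch described above (a sequence present in the genome FASTA but absent from the GTF) can be checked before launching the pipeline. The following is a minimal sketch, not part of CirComPara2: the function names are hypothetical, and it assumes sequence names are the first whitespace-delimited token of each FASTA header and the first tab-separated column of the GTF.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_names(path):
    """Collect sequence names from the '>' header lines of a FASTA file."""
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_names(path):
    """Collect sequence names from the first column of a GTF file."""
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            # Skip comment/header lines and empty lines.
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_path, gtf_path):
    """Return (names only in the FASTA, names only in the GTF), both sorted."""
    fa = fasta_names(fasta_path)
    gtf = gtf_names(gtf_path)
    return sorted(fa - gtf), sorted(gtf - fa)


# Example call (file names taken from the logs above):
# only_fa, only_gtf = compare_names(
#     "Homo_sapiens.GRCh38.dna.primary_assembly.fa",
#     "Homo_sapiens.GRCh38.97.gtf",
# )
```

Both lists are reported because a mismatch in either direction may be worth inspecting; in the case above, the offending scaffold would show up in the first list.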

egaffo commented 1 year ago

I'm glad you worked out the problem, and thank you for sharing the solution. The gene_expression_analysis.html was only in CirComPara 1. I removed the report generation from CirComPara2 as it was too cumbersome to maintain, and decided I would decouple the report generation into a separate future package. Anyhow, CirComPara2 saves all the tools' outputs and statistics tables into your project dir. The reliable_circexp.csv expression matrix is definitely what you need to perform circRNA DE analysis, for which I do not recommend DESeq2: check our recent benchmarking work "Systematic benchmarking of statistical methods to assess differential expression of circular RNAs" for more info. If you also need gene DE analysis, I suggest you use tximport (or tximeta), using as input the t_data.ctab files you can find in each sample directory (samples/sampleID/processings/stringtie/ballgown_ctabs). More details here. For DEG analysis DESeq2 is fine. For normalisation of circRNA expression, the approach is (not) debated... I mean, so far, there is no work assessing the best normalisation procedure (even though we explored it a little bit in the preprint linked above). You may normalise using only the circRNA expression matrix (as we do) or according to gene expression data, as proposed by CIRIquant.
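To illustrate the first option (normalising using only the circRNA expression matrix), one very simple form of library-size scaling is counts per million. This is only a sketch and not necessarily the procedure used by CirComPara2 or in the benchmarking work. The function name and the in-memory layout (a dict from circRNA id to per-sample counts) are assumptions, since the column layout of reliable_circexp.csv is not described in this thread.

```python
def counts_per_million(counts):
    """Scale each sample's counts so that they sum to one million.

    counts: dict mapping a circRNA id to a list of raw counts,
            one value per sample, same sample order in every row.
    Returns a dict with the same keys and CPM-scaled values.
    """
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    # Per-sample totals computed from the circRNA matrix only.
    totals = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        circ_id: [
            1e6 * value / total if total else 0.0
            for value, total in zip(row, totals)
        ]
        for circ_id, row in counts.items()
    }


# Toy example with two circRNAs and two samples (made-up numbers).
toy = {
    "chr6:83398366-83407901": [10, 5],
    "chr1:100-200": [30, 15],
}
normalised = counts_per_million(toy)
```

Dedicated DE tools apply their own normalisation, so a scaling like this is mainly useful for exploratory plots.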

pecoraro90 commented 1 year ago

Hi Enrico, I have a further question about the CirComPara2 output, more specifically about the reliable_circexp.csv file. I assumed that the coordinates in the first column are the coordinates of the whole length of the detected circRNA. However, I calculated the length and some of them are very large (> 70 000 kb), which seems to me a bit too large for a circRNA, doesn't it? If I wanted to experimentally validate the expression of that circRNA by PCR, would I need to virtually circularize the sequence given in the first column of the reliable_circexp.csv file (e.g. chr6:83398366-83407901) and design my primers on the two sides of the junction?

egaffo commented 1 year ago

Yes, the circRNA ids are just the genomic coordinates of the backsplice. They do not necessarily correspond to the circRNA length because introns within that range might be spliced out. Therefore, you can have a backsplice spanning 70 Kb in the genome, but a circRNA of only a few hundred bases if short exons are backspliced. You can design divergent PCR primers to cover the backsplice: one pointing toward the start position (covering bases downstream of the start) and the other pointing toward the end position (covering bases upstream of the end).
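The difference between the genomic span of a backsplice and the spliced circRNA length, and the "virtual circularization" used for primer design, can be sketched as follows. This is an illustration only: the function names are hypothetical, the id is assumed to use 1-based inclusive coordinates (the span is off by one if the start is 0-based), and the exon list and sequence must come from your own annotation and genome.

```python
import re


def parse_circ_id(circ_id):
    """Split an id such as 'chr6:83398366-83407901' into (chrom, start, end)."""
    match = re.fullmatch(r"(.+):(\d+)-(\d+)", circ_id)
    if match is None:
        raise ValueError(f"unexpected circRNA id format: {circ_id!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def genomic_span(start, end):
    """Genomic distance covered by the backsplice (1-based inclusive assumed)."""
    return end - start + 1


def spliced_length(exons):
    """Sum of exon lengths for exons given as (start, end), 1-based inclusive."""
    return sum(exon_end - exon_start + 1 for exon_start, exon_end in exons)


def junction_template(spliced_seq, flank=100):
    """Join the 3' tail of the spliced sequence to its 5' head.

    spliced_seq must be the exons joined in transcript orientation.
    The backsplice junction sits at position `flank` of the returned string,
    so conventional primers designed on it are divergent on the linear genome.
    """
    flank = min(flank, len(spliced_seq) // 2)
    if flank <= 0:
        raise ValueError("sequence too short to build a junction template")
    return spliced_seq[-flank:] + spliced_seq[:flank]


chrom, start, end = parse_circ_id("chr6:83398366-83407901")
span = genomic_span(start, end)  # genomic span, not the circRNA length
```

The template can then be given to any primer design tool, requiring that the amplicon includes the junction position.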

pecoraro90 commented 1 year ago

Hi Enrico, thanks for the help. I am starting to get the first validation results! I have just one doubt. I noticed in the annotation file circ_to_genes.tsv that the circ_strand is always +. I am assuming this is fictitious, am I wrong? If not, how is this possible? I guess primer design is not affected by the strand of the sequence, given that primers always come as a Fw and Rv pair.

egaffo commented 1 year ago

Yes, it is always + if you did not use/specify stranded libraries. You may guess the strand by checking the circRNA host gene strand. As you pointed out, the strand is not relevant to primer design.