crimBubble / ECCsplorer

The ECCsplorer is a bioinformatics pipeline for the automated detection of extrachromosomal circular DNA (eccDNA) from paired-end read data of amplified circular DNA.
GNU General Public License v3.0

Does not work for human data: ValueError: all the input array dimensions for the concatenation axis must match exactly #13

Closed. ShixiangWang closed this issue 1 year ago

ShixiangWang commented 1 year ago

I tried to use this tool on my cancer paired-end sequencing data from circular DNA enrichment samples. For comparison, Circle-Map works well on the same data. However, according to the manuscript description, ECCsplorer produces more robust results and output files, so I would like to use it for my main results.

$ cat ./run_eccsplorer_23_0217.sh
#!/bin/bash

# eccsplorer environment

eccsplorer ~/ecc_finder_Results/SNU16-HMW.R1.fastq.gz ~/ecc_finder_Results/SNU16-HMW.R2.fastq.gz -ref /data1/database/human/hg38/genome.fa -out ~/eccsplorer_Results -m map -cpu 10 -tax human

$ ./run_eccsplorer_23_0217.sh
2023-02-21 09:15:21,545 - [setup_logging] INFO: Starting ECCsplorer pipeline with...
Output directory:          /data3/wsx/eccsplorer_Results
Prefix data set A:         TR
File 1 data set A (f1a):   /data3/wsx/ecc_finder_Results/SNU16-HMW.R1.fastq.gz
File 2 data set A (f2a):   /data3/wsx/ecc_finder_Results/SNU16-HMW.R2.fastq.gz
Prefix data set B:         CO
File 1 data set B (f1b):   ---
File 2 data set B (f2b):   ---
Reference genome sequence: /data1/database/human/hg38/genome.fa
Custom BLAST+ database:    None
Taxon:                     human
Read trimming option:      None
Mapping window size:       100
Genome size:               ---
User read count:           not set, using max. available reads
Run mode:                  map
Image format:              png
Max threads used:          10
Logging to file:           No
2023-02-21 09:15:21,546 - [basic_checkups] INFO: Performing basic checkups.
2023-02-21 09:15:21,763 - [r_connection] INFO: Starting Rserve...
Rserv started in daemon mode.
2023-02-21 09:15:24,196 - [r_connection] INFO: Rserv connected (Port 43320)
2023-02-21 09:15:25,540 - [r_connection] INFO: Rserv loading libraries.
2023-02-21 09:15:25,540 - [basic_checkups] INFO: Basic checkups passed successfully.
2023-02-21 09:15:44,088 - [basic_setup] INFO: Creating directory structure.
2023-02-21 09:16:01,214 - [basic_setup] INFO: Chromosome size file found.
2023-02-21 09:16:01,615 - [basic_setup] INFO: RepeatExplorer database location: /data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta
2023-02-21 09:16:01,615 - [basic_setup] INFO: BLAST+ database found /data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta.
2023-02-21 09:16:03,450 - [basic_setup] INFO: Created nucleotide BLAST (alias) database /data3/wsx/eccsplorer_Results/reference_data/combinedDB.fas with 888 sequences
2023-02-21 09:16:03,451 - [basic_setup] INFO: Using combined BLAST+ databases (combinedDB.fas) containing:
/data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta
2023-02-21 09:16:03,451 - [main] INFO: Starting pipeline modules.
2023-02-21 09:16:03,451 - [main] INFO: Starting trimming. This might take a while.
2023-02-21 09:16:03,452 - [main] INFO: Trimming skipped.
Converted SNU16-HMW.R2.fastq.FASTA file found. Skipping converting. (ProcessID 271486)
Converted SNU16-HMW.R1.fastq.FASTA file found. Skipping converting. (ProcessID 271485)
2023-02-21 09:16:03,560 - [converter] INFO: Converted files:
/data3/wsx/eccsplorer_Results/read_data/SNU16-HMW.R1.fastq.fa
/data3/wsx/eccsplorer_Results/read_data/SNU16-HMW.R2.fastq.fa
2023-02-21 09:16:03,665 - [converter] INFO: Converting took 0.21s
2023-02-21 09:16:03,667 - [main] INFO: Creating Rscript for visualization purpose. Might be edited to fit individual needs
2023-02-21 09:16:03,669 - [mapper_coordinator] INFO: Index file for segemehl mapping found.
2023-02-21 09:16:03,671 - [mapper_coordinator] INFO: Chromosome size file found.
2023-02-21 09:16:03,671 - [mapper_coordinator] INFO: Reference genome sequence window file found.
2023-02-21 09:16:03,671 - [mapper_coordinator] INFO: Start: Map reads against reference genome sequence.
2023-02-21 09:16:03,671 - [run_mapping] INFO: Mapping file found. Using existing file.
2023-02-21 09:16:03,672 - [run_mapping] INFO: Checking split read file.
2023-02-21 09:16:04,845 - [run_mapping] INFO: Converting alignment from SAM to BED.
[bam_sort_core] merging from 100 files and 10 in-memory blocks...
2023-02-21 09:33:02,083 - [run_mapping] INFO: Gathering discordant mapping reads.
2023-02-21 09:33:02,083 - [run_mapping] INFO: Searching for reads not mapped in proper pair.
[bam_sort_core] merging from 0 files and 10 in-memory blocks...
2023-02-21 09:38:34,754 - [run_mapping] INFO: Searching for reads mapped in unusual orientation (rev-for, 83).
[bam_sort_core] merging from 20 files and 10 in-memory blocks...
2023-02-21 09:46:26,050 - [run_mapping] INFO: Searching for reads mapped in unusual orientation (rev-for, 163).
[bam_sort_core] merging from 20 files and 10 in-memory blocks...
2023-02-21 09:53:52,143 - [run_mapping] INFO: Removing duplicates from discordant mapping reads.
2023-02-21 10:23:19,820 - [run_mapping] INFO: Finished gathering discordant mapping reads.
2023-02-21 10:23:19,820 - [run_mapping] INFO: Collecting statistics from alignment file (SAM).
2023-02-21 10:31:07,766 - [run_mapping] INFO: Summarizing mapping.
2023-02-21 10:31:07,766 - [run_mapping] INFO: Finished mapping: TR
2023-02-21 10:31:07,766 - [mapper_coordinator] INFO: Finished: Mapped reads against reference genome sequence!
2023-02-21 10:31:07,766 - [mapper_coordinator] INFO: Start: Summarize mapped split reads (SR) and calculate SR regions.
2023-02-21 10:31:07,766 - [run_splitread_detect] INFO: Calculating regions from split reads.
[SEGEMEHL] Tue Feb 21 10:31:07 2023: reading 1 files.
storing trackname "SingleSplit:A1"
[SEGEMEHL] Tue Feb 21 10:31:10 2023: sorting 879704 items.
[SEGEMEHL] Tue Feb 21 10:31:11 2023: summarizing 879704 splits.
2023-02-21 10:31:11,426 - [run_splitread_detect] INFO: Merging and cleaning up regions.
2023-02-21 10:31:11,643 - [mapper_coordinator] INFO: Finished: Summarized mapped SR and calculated SR regions.
2023-02-21 10:31:11,643 - [mapper_coordinator] INFO: Start: Summarize discordant mapped reads (DR) and calculate DR regions.
2023-02-21 10:31:11,643 - [run_discordantread_detect] INFO: Calculating genome coverage from discordant mapping reads.
2023-02-21 10:39:48,633 - [run_discordantread_detect] INFO: Merging and cleaning up regions.
2023-02-21 10:41:20,292 - [mapper_coordinator] INFO: Start: Summarized mapped DR and calculated DR regions.
2023-02-21 10:43:40,123 - [mapper_coordinator] INFO: Calculating rough coverage and find peaks.
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101621; MultiID 01)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101622; MultiID 07)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101619; MultiID 02)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101623; MultiID 03)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101624; MultiID 00)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101625; MultiID 04)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101626; MultiID 06)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101627; MultiID 08)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101628; MultiID 05)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 101633; MultiID 09)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101626)
No peaks in TR_map-all.COVERAGE (ProcessID 101626)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101625)
No peaks in TR_map-all.COVERAGE (ProcessID 101625)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101633)
No peaks in TR_map-all.COVERAGE (ProcessID 101633)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101622)
No peaks in TR_map-all.COVERAGE (ProcessID 101622)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101624)
No peaks in TR_map-all.COVERAGE (ProcessID 101624)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101628)
No peaks in TR_map-all.COVERAGE (ProcessID 101628)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101623)
No peaks in TR_map-all.COVERAGE (ProcessID 101623)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101627)
No peaks in TR_map-all.COVERAGE (ProcessID 101627)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101621)
No peaks in TR_map-all.COVERAGE (ProcessID 101621)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 101619)
Traceback (most recent call last):
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 815, in <module>
    main()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/ECCsplorer/lib/eccMapper.py", line 750, in mapper_coordinator
    self.win_coverage = np.concatenate((self.win_coverage, self.sum_coverages_tr), axis=1)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 30882878 and the array at index 1 has size 3077065
2023-02-21 10:55:07,051 - [r_shutdown] INFO: Shutting down Rserve.
2023-02-21 10:55:07,057 - [exit_err] ERROR: Sorry, something went wrong.
crimBubble commented 1 year ago

Dear @ShixiangWang, did you get the same error in your first run? The output here appears to come from a re-run of the pipeline, and the error might be caused by mismatching files from two runs. Can you run the pipeline with a new output directory or delete the old data? If the error persists, please let me know.
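
The two sizes in the traceback can be sanity-checked against the reference. The sketch below is only a rough plausibility check and rests on an assumption that is not confirmed by the ECCsplorer source: that the window file holds one window per started 100 bp stretch of each reference sequence. The total length (3088286401) and sequence count (25) are the values segemehl reports for this reference in the second run of this thread; the function names `window_count_bounds` and `is_plausible_window_count` are hypothetical helpers, not part of ECCsplorer.

```python
def window_count_bounds(total_length, n_sequences, window):
    """Bounds on sum(ceil(len_i / window)) when only the total length
    and the number of sequences are known (assumed windowing scheme)."""
    # Lower bound: all sequences packed end to end with no partial windows lost.
    lower = -(-total_length // window)  # integer ceiling division
    # Upper bound: every sequence contributes one extra partial window.
    upper = total_length // window + n_sequences
    return lower, upper


def is_plausible_window_count(rows, total_length, n_sequences, window):
    """True if `rows` could be the window count for this reference."""
    lower, upper = window_count_bounds(total_length, n_sequences, window)
    return lower <= rows <= upper


if __name__ == "__main__":
    # Array sizes taken from the ValueError in the traceback above.
    for rows in (30882878, 3077065):
        ok = is_plausible_window_count(rows, 3088286401, 25, 100)
        print(rows, "plausible" if ok else "not plausible")
    # 30882878 plausible
    # 3077065 not plausible
```

Under that assumption, the array at index 0 matches the hg38 reference at a 100 bp window size, while the array at index 1 does not. This only narrows down which of the two arrays disagrees with the reference; it does not show why it does.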

ShixiangWang commented 1 year ago

Yes. I will test in a new result dir.
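
One way to make sure a re-run never picks up files from an earlier run is to derive a fresh output directory for every invocation. The following is a minimal sketch, not part of ECCsplorer: the helper names `fresh_output_dir` and `build_command` are hypothetical, and the command-line flags are copied from the run script at the top of this issue rather than from the tool's documentation.

```python
import shlex
from datetime import datetime
from pathlib import Path


def fresh_output_dir(base, now):
    """Return a sibling of `base` whose name carries a timestamp suffix."""
    base = Path(base)
    return base.with_name(f"{base.name}_{now:%Y%m%d_%H%M%S}")


def build_command(reads_1, reads_2, reference, out_dir, threads):
    """Assemble the eccsplorer call with the flags used in this issue."""
    return [
        "eccsplorer", str(reads_1), str(reads_2),
        "-ref", str(reference),
        "-out", str(out_dir),
        "-m", "map",
        "-cpu", str(threads),
        "-tax", "human",
    ]


if __name__ == "__main__":
    out_dir = fresh_output_dir("~/eccsplorer_Results", datetime.now())
    cmd = build_command(
        "~/ecc_finder_Results/SNU16-HMW.R1.fastq.gz",
        "~/ecc_finder_Results/SNU16-HMW.R2.fastq.gz",
        "/data1/database/human/hg38/genome.fa",
        out_dir,
        10,
    )
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
```

The trade-off is that cached intermediate files such as the segemehl index are rebuilt on every run, which costs time on a human-sized reference.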

ShixiangWang commented 1 year ago
(eccsplorer) [wsx@xu2 code]$ ./run_eccsplorer_23_0217.sh
2023-02-21 15:46:14,905 - [setup_logging] INFO: Starting ECCsplorer pipeline with...
Output directory:          /data3/wsx/eccsplorer_Results2
Prefix data set A:         TR
File 1 data set A (f1a):   /data3/wsx/ecc_finder_Results/SNU16-HMW.R1.fastq.gz
File 2 data set A (f2a):   /data3/wsx/ecc_finder_Results/SNU16-HMW.R2.fastq.gz
Prefix data set B:         CO
File 1 data set B (f1b):   ---
File 2 data set B (f2b):   ---
Reference genome sequence: /data1/database/human/hg38/genome.fa
Custom BLAST+ database:    None
Taxon:                     human
Read trimming option:      None
Mapping window size:       100
Genome size:               ---
User read count:           not set, using max. available reads
Run mode:                  map
Image format:              png
Max threads used:          4
Logging to file:           No
2023-02-21 15:46:14,905 - [basic_checkups] INFO: Performing basic checkups.
2023-02-21 15:46:15,123 - [r_connection] INFO: Starting Rserve...
Rserv started in daemon mode.
2023-02-21 15:46:17,287 - [r_connection] INFO: Rserv connected (Port 33257)
2023-02-21 15:46:18,699 - [r_connection] INFO: Rserv loading libraries.
2023-02-21 15:46:18,699 - [basic_checkups] INFO: Basic checkups passed successfully.
2023-02-21 15:46:36,811 - [basic_setup] INFO: Creating directory structure.
2023-02-21 15:46:40,127 - [basic_setup] INFO: Creating chromosome size file.
2023-02-21 15:46:50,507 - [basic_setup] INFO: RepeatExplorer database location: /data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta
2023-02-21 15:46:50,508 - [basic_setup] INFO: BLAST+ database found /data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta.
2023-02-21 15:47:00,872 - [basic_setup] INFO: Created nucleotide BLAST (alias) database /data3/wsx/eccsplorer_Results2/reference_data/combinedDB.fas with 888 sequences
2023-02-21 15:47:00,873 - [basic_setup] INFO: Using combined BLAST+ databases (combinedDB.fas) containing:
/data3/wsx/miniconda3/envs/eccsplorer/bin/repex_tarean/databases/dna_database_masked.fasta
2023-02-21 15:47:00,873 - [main] INFO: Starting pipeline modules.
2023-02-21 15:47:00,873 - [main] INFO: Starting trimming. This might take a while.
2023-02-21 15:47:00,873 - [main] INFO: Trimming skipped.
Trying to convert FASTQ to FASTA using "seqtk": SNU16-HMW.R2.fastq (ProcessID 237606)
Trying to convert FASTQ to FASTA using "seqtk": SNU16-HMW.R1.fastq (ProcessID 237604)
Converted SNU16-HMW.R1.fastq.FASTQ to SNU16-HMW.R1.fastq.FASTA using "seqtk" (ProcessID 237604)
Converted SNU16-HMW.R2.fastq.FASTQ to SNU16-HMW.R2.fastq.FASTA using "seqtk" (ProcessID 237606)
2023-02-21 15:53:26,162 - [converter] INFO: Converted files:
/data3/wsx/eccsplorer_Results2/read_data/SNU16-HMW.R1.fastq.fa
/data3/wsx/eccsplorer_Results2/read_data/SNU16-HMW.R2.fastq.fa
2023-02-21 15:53:26,250 - [converter] INFO: Converting took 385.38s
2023-02-21 15:53:26,250 - [main] INFO: Creating Rscript for visualization purpose. Might be edited to fit individual needs
2023-02-21 15:53:26,251 - [mapper_coordinator] INFO: Creating index file for segemehl mapping.
2023-02-21 17:35:10,949 - [mapper_coordinator] INFO: [SEGEMEHL] Tue Feb 21 15:53:26 2023: reading database sequences.
[SEGEMEHL] Tue Feb 21 15:53:59 2023: 25 database sequences found.
[SEGEMEHL] Tue Feb 21 15:53:59 2023: total length of db sequences: 3088286401
[SEGEMEHL] Tue Feb 21 15:53:59 2023: assigning all reads to default read group 'A1'.
[SEGEMEHL] Tue Feb 21 15:53:59 2023: additional read group default values '     SM:sample1      LB:library1     PU:unit1        PL:illumina'
[SEGEMEHL] Tue Feb 21 15:53:59 2023: reads assigned to read group 'A1'
[SEGEMEHL] Tue Feb 21 15:53:59 2023: compiled sam header.
[SEGEMEHL] Tue Feb 21 15:54:10 2023: alphabet of size (7): ACGNT~
[SEGEMEHL] Tue Feb 21 15:54:10 2023: size of db sequence: 3088286426
[SEGEMEHL] Tue Feb 21 15:54:10 2023: constructing suftab.
[SEGEMEHL] Tue Feb 21 15:54:10 2023: alurusort: classify.
init bit array of 386035804
[SEGEMEHL] Tue Feb 21 15:54:30 2023: alurusort: getting bit.
[SEGEMEHL] Tue Feb 21 15:54:30 2023: not bit alurusort: alloc B of size 1376024063.
[SEGEMEHL] Tue Feb 21 15:54:30 2023: alurusort: initbitarray of size 1376024063.
init bit array of 172003008
[SEGEMEHL] Tue Feb 21 15:54:30 2023: alurusort: Qmaxdist in cl of size 3088286426.
[SEGEMEHL] Tue Feb 21 15:54:35 2023: alurusort: scan B.
[SEGEMEHL] Tue Feb 21 15:54:42 2023: alurusort: substringsort.
[SEGEMEHL] Tue Feb 21 15:54:42 2023: setting bit array to zero
[SEGEMEHL] Tue Feb 21 15:54:42 2023: allocating space for buckets and buffers
[SEGEMEHL] Tue Feb 21 15:54:42 2023: memsetting
[SEGEMEHL] Tue Feb 21 15:56:47 2023: substring sort ... ok
[SEGEMEHL] Tue Feb 21 15:56:50 2023: checking valbitarray.
[SEGEMEHL] Tue Feb 21 15:56:50 2023: enter Tprime calculation.
[SEGEMEHL] Tue Feb 21 15:56:50 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 15:57:06 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 15:57:34 2023: tprime: iterating i=3088286426 elements with lenB=1376024063.
[SEGEMEHL] Tue Feb 21 15:57:52 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 15:57:55 2023: enter alursortint.
[SEGEMEHL] Tue Feb 21 15:57:55 2023: alurusortint: classify int.
init bit array of 172003008
[SEGEMEHL] Tue Feb 21 15:58:03 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 15:58:03 2023: alurusortint: init bcktsA.
init bit array of 172003008
[SEGEMEHL] Tue Feb 21 15:58:03 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 15:58:07 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 15:58:07 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 15:58:07 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 15:58:08 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 15:58:08 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 15:58:16 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 15:58:16 2023: countsortint: exiting
init bit array of 68101118
[SEGEMEHL] Tue Feb 21 15:58:16 2023: arrayB: allocating B with 544808941 elements.
[SEGEMEHL] Tue Feb 21 15:58:17 2023: arrayB: iterating to lenA=1376024063.
[SEGEMEHL] Tue Feb 21 15:58:25 2023: arrayB: exiting
init bit array of 172001758
[SEGEMEHL] Tue Feb 21 15:58:39 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 15:58:39 2023: getlistsL: memsetting list of 1376014063 elements.
[SEGEMEHL] Tue Feb 21 15:58:39 2023: getlistsL: iter from 1376024062 down to 0.
[SEGEMEHL] Tue Feb 21 16:00:20 2023: scanning A (1376024063 elems).
[SEGEMEHL] Tue Feb 21 16:00:25 2023: scanning accdist (200004 elems) (1).
[SEGEMEHL] Tue Feb 21 16:00:25 2023: scanning accdist (200004 elems) (2).
[SEGEMEHL] Tue Feb 21 16:00:26 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:00:26 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:00:26 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:00:37 2023: sortlistL: iterating 544808941 elems.
[SEGEMEHL] Tue Feb 21 16:00:50 2023: sortlistL: looping 1376014063 elems.
[SEGEMEHL] Tue Feb 21 16:06:20 2023: sortlistL: iterating 1376024063 elems.
[SEGEMEHL] Tue Feb 21 16:07:25 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:07:28 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:07:28 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:07:34 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:07:48 2023: tprime: iterating i=1376024063 elements with lenB=544808941.
[SEGEMEHL] Tue Feb 21 16:07:53 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:07:55 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:07:55 2023: alurusortint: classify int.
init bit array of 68101118
[SEGEMEHL] Tue Feb 21 16:07:57 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:07:57 2023: alurusortint: init bcktsA.
init bit array of 68101118
[SEGEMEHL] Tue Feb 21 16:07:57 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:07:58 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:07:58 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:07:58 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:08:01 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:08:01 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:08:43 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:08:43 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:08:43 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:08:43 2023: 244462871  300346070       544808941.

init bit array of 30557859
[SEGEMEHL] Tue Feb 21 16:08:43 2023: arrayB: allocating B with 244462871 elements.
[SEGEMEHL] Tue Feb 21 16:08:44 2023: arrayB: iterating to lenA=544808941.
[SEGEMEHL] Tue Feb 21 16:08:53 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:08:53 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:08:53 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:08:57 2023: alurusortint: enter distCount.
init bit array of 68101116
[SEGEMEHL] Tue Feb 21 16:08:58 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:08:58 2023: getlistsS: memsetting list of 544808922 elements.
[SEGEMEHL] Tue Feb 21 16:08:58 2023: getlistsS: iter up to 544808941.
[SEGEMEHL] Tue Feb 21 16:10:29 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:10:37 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:10:37 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:10:37 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:10:38 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:10:38 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:10:41 2023: sortlistS: iterating 244462871 elems.
[SEGEMEHL] Tue Feb 21 16:10:45 2023: sortlistS: looping 544808922 elems.
[SEGEMEHL] Tue Feb 21 16:13:25 2023: sortlistS: iterating 544808941 elems.
[SEGEMEHL] Tue Feb 21 16:14:02 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:14:03 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:14:03 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:14:05 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:14:09 2023: tprime: iterating i=544808941 elements with lenB=244462871.
[SEGEMEHL] Tue Feb 21 16:14:12 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:14:12 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:14:12 2023: alurusortint: classify int.
init bit array of 30557859
[SEGEMEHL] Tue Feb 21 16:14:13 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:14:13 2023: alurusortint: init bcktsA.
init bit array of 30557859
[SEGEMEHL] Tue Feb 21 16:14:13 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:14:14 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:14:14 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:14:14 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:14:18 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:14:18 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:14:53 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:14:54 2023: countsortint: exiting
init bit array of 13617614
[SEGEMEHL] Tue Feb 21 16:14:54 2023: arrayB: allocating B with 108940907 elements.
[SEGEMEHL] Tue Feb 21 16:14:54 2023: arrayB: iterating to lenA=244462871.
[SEGEMEHL] Tue Feb 21 16:14:59 2023: arrayB: exiting
init bit array of 30557859
[SEGEMEHL] Tue Feb 21 16:15:01 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:15:01 2023: getlistsL: memsetting list of 244462865 elements.
[SEGEMEHL] Tue Feb 21 16:15:01 2023: getlistsL: iter from 244462870 down to 0.
[SEGEMEHL] Tue Feb 21 16:15:33 2023: scanning A (244462871 elems).
[SEGEMEHL] Tue Feb 21 16:15:38 2023: scanning accdist (78 elems) (1).
[SEGEMEHL] Tue Feb 21 16:15:38 2023: scanning accdist (78 elems) (2).
[SEGEMEHL] Tue Feb 21 16:15:39 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:15:39 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:15:39 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:15:40 2023: sortlistL: iterating 108940907 elems.
[SEGEMEHL] Tue Feb 21 16:15:43 2023: sortlistL: looping 244462865 elems.
[SEGEMEHL] Tue Feb 21 16:16:54 2023: sortlistL: iterating 244462871 elems.
[SEGEMEHL] Tue Feb 21 16:17:10 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:17:10 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:17:10 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:17:11 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:17:13 2023: tprime: iterating i=244462871 elements with lenB=108940907.
[SEGEMEHL] Tue Feb 21 16:17:14 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:17:14 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:17:14 2023: alurusortint: classify int.
init bit array of 13617614
[SEGEMEHL] Tue Feb 21 16:17:15 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:17:15 2023: alurusortint: init bcktsA.
init bit array of 13617614
[SEGEMEHL] Tue Feb 21 16:17:15 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:17:15 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:17:15 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:17:15 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:17:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:17:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:17:32 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:17:32 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:17:32 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:17:32 2023: 48665050   60275857        108940907.

init bit array of 6083132
[SEGEMEHL] Tue Feb 21 16:17:32 2023: arrayB: allocating B with 48665050 elements.
[SEGEMEHL] Tue Feb 21 16:17:32 2023: arrayB: iterating to lenA=108940907.
[SEGEMEHL] Tue Feb 21 16:17:34 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:17:34 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:17:34 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:17:35 2023: alurusortint: enter distCount.
init bit array of 13617613
[SEGEMEHL] Tue Feb 21 16:17:35 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:17:35 2023: getlistsS: memsetting list of 108940904 elements.
[SEGEMEHL] Tue Feb 21 16:17:35 2023: getlistsS: iter up to 108940907.
[SEGEMEHL] Tue Feb 21 16:17:47 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:17:49 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:17:49 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:17:49 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:17:49 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:17:49 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:17:50 2023: sortlistS: iterating 48665050 elems.
[SEGEMEHL] Tue Feb 21 16:17:51 2023: sortlistS: looping 108940904 elems.
[SEGEMEHL] Tue Feb 21 16:18:19 2023: sortlistS: iterating 108940907 elems.
[SEGEMEHL] Tue Feb 21 16:18:25 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:18:26 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:18:26 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:18:26 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:18:27 2023: tprime: iterating i=108940907 elements with lenB=48665050.
[SEGEMEHL] Tue Feb 21 16:18:27 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:18:27 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:18:27 2023: alurusortint: classify int.
init bit array of 6083132
[SEGEMEHL] Tue Feb 21 16:18:28 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:18:28 2023: alurusortint: init bcktsA.
init bit array of 6083132
[SEGEMEHL] Tue Feb 21 16:18:28 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:18:28 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:18:28 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:18:28 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:18:28 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:18:28 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:18:35 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:18:35 2023: countsortint: exiting
init bit array of 2735535
[SEGEMEHL] Tue Feb 21 16:18:35 2023: arrayB: allocating B with 21884276 elements.
[SEGEMEHL] Tue Feb 21 16:18:35 2023: arrayB: iterating to lenA=48665050.
[SEGEMEHL] Tue Feb 21 16:18:36 2023: arrayB: exiting
init bit array of 6083131
[SEGEMEHL] Tue Feb 21 16:18:36 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:18:36 2023: getlistsL: memsetting list of 48665046 elements.
[SEGEMEHL] Tue Feb 21 16:18:36 2023: getlistsL: iter from 48665049 down to 0.
[SEGEMEHL] Tue Feb 21 16:18:41 2023: scanning A (48665050 elems).
[SEGEMEHL] Tue Feb 21 16:18:42 2023: scanning accdist (43 elems) (1).
[SEGEMEHL] Tue Feb 21 16:18:42 2023: scanning accdist (43 elems) (2).
[SEGEMEHL] Tue Feb 21 16:18:42 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:18:42 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:18:42 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:18:42 2023: sortlistL: iterating 21884276 elems.
[SEGEMEHL] Tue Feb 21 16:18:43 2023: sortlistL: looping 48665046 elems.
[SEGEMEHL] Tue Feb 21 16:18:54 2023: sortlistL: iterating 48665050 elems.
[SEGEMEHL] Tue Feb 21 16:18:57 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:18:57 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:18:57 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:18:57 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:18:57 2023: tprime: iterating i=48665050 elements with lenB=21884276.
[SEGEMEHL] Tue Feb 21 16:18:58 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:18:58 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:18:58 2023: alurusortint: classify int.
init bit array of 2735535
[SEGEMEHL] Tue Feb 21 16:18:58 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:18:58 2023: alurusortint: init bcktsA.
init bit array of 2735535
[SEGEMEHL] Tue Feb 21 16:18:58 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:18:58 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:18:58 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:18:58 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:18:58 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:18:58 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:01 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:01 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:01 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: 9766114    12118162        21884276.

init bit array of 1220765
[SEGEMEHL] Tue Feb 21 16:19:01 2023: arrayB: allocating B with 9766114 elements.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: arrayB: iterating to lenA=21884276.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:01 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: alurusortint: enter distCount.
init bit array of 2735535
[SEGEMEHL] Tue Feb 21 16:19:01 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: getlistsS: memsetting list of 21884273 elements.
[SEGEMEHL] Tue Feb 21 16:19:01 2023: getlistsS: iter up to 21884276.
[SEGEMEHL] Tue Feb 21 16:19:03 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:04 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:04 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:04 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:04 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:04 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:04 2023: sortlistS: iterating 9766114 elems.
[SEGEMEHL] Tue Feb 21 16:19:04 2023: sortlistS: looping 21884273 elems.
[SEGEMEHL] Tue Feb 21 16:19:09 2023: sortlistS: iterating 21884276 elems.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:10 2023: tprime: iterating i=21884276 elements with lenB=9766114.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: classify int.
init bit array of 1220765
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: init bcktsA.
init bit array of 1220765
[SEGEMEHL] Tue Feb 21 16:19:10 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:10 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:10 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:10 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:10 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:10 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:11 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:11 2023: countsortint: exiting
init bit array of 547214
[SEGEMEHL] Tue Feb 21 16:19:11 2023: arrayB: allocating B with 4377712 elements.
[SEGEMEHL] Tue Feb 21 16:19:11 2023: arrayB: iterating to lenA=9766114.
[SEGEMEHL] Tue Feb 21 16:19:11 2023: arrayB: exiting
init bit array of 1220765
[SEGEMEHL] Tue Feb 21 16:19:12 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:19:12 2023: getlistsL: memsetting list of 9766113 elements.
[SEGEMEHL] Tue Feb 21 16:19:12 2023: getlistsL: iter from 9766113 down to 0.
[SEGEMEHL] Tue Feb 21 16:19:12 2023: scanning A (9766114 elems).
[SEGEMEHL] Tue Feb 21 16:19:12 2023: scanning accdist (18 elems) (1).
[SEGEMEHL] Tue Feb 21 16:19:12 2023: scanning accdist (18 elems) (2).
[SEGEMEHL] Tue Feb 21 16:19:12 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:19:12 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:19:12 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:19:12 2023: sortlistL: iterating 4377712 elems.
[SEGEMEHL] Tue Feb 21 16:19:13 2023: sortlistL: looping 9766113 elems.
[SEGEMEHL] Tue Feb 21 16:19:14 2023: sortlistL: iterating 9766114 elems.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:15 2023: tprime: iterating i=9766114 elements with lenB=4377712.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: classify int.
init bit array of 547214
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: init bcktsA.
init bit array of 547214
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:15 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:15 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: 1960803    2416909 4377712.

init bit array of 245101
[SEGEMEHL] Tue Feb 21 16:19:15 2023: arrayB: allocating B with 1960803 elements.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: arrayB: iterating to lenA=4377712.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter distCount.
init bit array of 547214
[SEGEMEHL] Tue Feb 21 16:19:15 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: getlistsS: memsetting list of 4377711 elements.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: getlistsS: iter up to 4377712.
[SEGEMEHL] Tue Feb 21 16:19:15 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:16 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:16 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistS: iterating 1960803 elems.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistS: looping 4377711 elems.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistS: iterating 4377712 elems.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:16 2023: tprime: iterating i=4377712 elements with lenB=1960803.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: classify int.
init bit array of 245101
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: init bcktsA.
init bit array of 245101
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:16 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:16 2023: countsortint: exiting
init bit array of 109381
[SEGEMEHL] Tue Feb 21 16:19:16 2023: arrayB: allocating B with 875042 elements.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: arrayB: iterating to lenA=1960803.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: arrayB: exiting
init bit array of 245101
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: getlistsL: memsetting list of 1960801 elements.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: getlistsL: iter from 1960802 down to 0.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: scanning A (1960803 elems).
[SEGEMEHL] Tue Feb 21 16:19:16 2023: scanning accdist (12 elems) (1).
[SEGEMEHL] Tue Feb 21 16:19:16 2023: scanning accdist (12 elems) (2).
[SEGEMEHL] Tue Feb 21 16:19:16 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:19:16 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:19:16 2023: sortlistL: iterating 875042 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: looping 1960801 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 1960803 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=1960803 elements with lenB=875042.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 109381
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 109381
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: 392471     482571  875042.

init bit array of 49059
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 392471 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=875042.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter distCount.
init bit array of 109381
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: memsetting list of 875041 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: iter up to 875042.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 392471 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: looping 875041 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 875042 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=875042 elements with lenB=392471.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 49059
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 49059
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
init bit array of 21927
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 175410 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=392471.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
init bit array of 49059
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: memsetting list of 392470 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: iter from 392470 down to 0.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning A (392471 elems).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (10 elems) (1).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (10 elems) (2).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 175410 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: looping 392470 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 392471 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=392471 elements with lenB=175410.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 21927
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 21927
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: 78836      96574   175410.

init bit array of 9855
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 78836 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=175410.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter distCount.
init bit array of 21926
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: memsetting list of 175404 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: iter up to 175410.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 78836 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: looping 175404 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 175410 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=175410 elements with lenB=78836.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 9855
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 9855
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
init bit array of 4421
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 35363 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=78836.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
init bit array of 9855
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: memsetting list of 78834 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: iter from 78835 down to 0.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning A (78836 elems).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (9 elems) (1).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (9 elems) (2).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 35363 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: looping 78834 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 78836 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=78836 elements with lenB=35363.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 4421
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 4421
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: 15794      19569   35363.

init bit array of 1975
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 15794 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=35363.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter distCount.
init bit array of 4420
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: memsetting list of 35360 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: iter up to 35363.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 15794 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: looping 35360 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 35363 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=35363 elements with lenB=15794.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 1975
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 1975
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
init bit array of 892
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 7136 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=15794.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
init bit array of 1975
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: memsetting list of 15793 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: iter from 15793 down to 0.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning A (15794 elems).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (10 elems) (1).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: scanning accdist (10 elems) (2).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsL: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: sort listsL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: allocating stuff
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 7136 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: looping 15793 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistL: iterating 15794 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsL: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter tprime.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: init arrays.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: scan B
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: iterating i=15794 elements with lenB=7136.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: tprime: exit
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter alurusortint.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: classify int.
init bit array of 892
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: getting bit.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: init bcktsA.
init bit array of 892
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: countingsort.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countingsortint: init buffers and A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: setting buffer to zero
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (1 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (2 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (3 of 3)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: scanning buffer (to set borders)
[SEGEMEHL] Tue Feb 21 16:19:17 2023: countsortint: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: Sorting type S suffixes. Init bcktsB.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: 3171       3965    7136.

init bit array of 397
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: allocating B with 3171 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: iterating to lenA=7136.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: arrayB: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qmaxdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter Qdist.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter distCount.
init bit array of 892
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter get listsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: memsetting list of 7135 elements.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: iter up to 7136.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: scan A
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: set accidst
[SEGEMEHL] Tue Feb 21 16:19:17 2023: getlistsS: exiting
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: freeing stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: enter sortlistsS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: allocating stuff.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 3171 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: looping 7135 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistS: iterating 7136 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: sortlistsS: exiting happily!
[SEGEMEHL] Tue Feb 21 16:19:17 2023: alurusortint: valbitarraysortedS.
init bit array of 892
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 15794 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 7136).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 1975
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 35363 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 15794).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 4421
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 78836 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 35363).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 9855
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 175410 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 78836).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 21927
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 392471 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 175410).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 49059
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 875042 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 392471).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 109381
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 1960803 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 875042).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 245101
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 4377712 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 1960803).
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: exit.
init bit array of 547214
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: iteration over 9766114 elems.
[SEGEMEHL] Tue Feb 21 16:19:17 2023: reconstruct: scan B (size: 4377712).
[SEGEMEHL] Tue Feb 21 16:19:18 2023: reconstruct: exit.
init bit array of 1220765
[SEGEMEHL] Tue Feb 21 16:19:19 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:19 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:19 2023: reconstruct: iteration over 21884276 elems.
[SEGEMEHL] Tue Feb 21 16:19:19 2023: reconstruct: scan B (size: 9766114).
[SEGEMEHL] Tue Feb 21 16:19:19 2023: reconstruct: exit.
init bit array of 2735535
[SEGEMEHL] Tue Feb 21 16:19:23 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:23 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:23 2023: reconstruct: iteration over 48665050 elems.
[SEGEMEHL] Tue Feb 21 16:19:23 2023: reconstruct: scan B (size: 21884276).
[SEGEMEHL] Tue Feb 21 16:19:23 2023: reconstruct: exit.
init bit array of 6083132
[SEGEMEHL] Tue Feb 21 16:19:33 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:19:33 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:33 2023: reconstruct: iteration over 108940907 elems.
[SEGEMEHL] Tue Feb 21 16:19:33 2023: reconstruct: scan B (size: 48665050).
[SEGEMEHL] Tue Feb 21 16:19:34 2023: reconstruct: exit.
init bit array of 13617614
[SEGEMEHL] Tue Feb 21 16:19:55 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:19:55 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:19:55 2023: reconstruct: iteration over 244462871 elems.
[SEGEMEHL] Tue Feb 21 16:19:57 2023: reconstruct: scan B (size: 108940907).
[SEGEMEHL] Tue Feb 21 16:19:58 2023: reconstruct: exit.
init bit array of 30557859
[SEGEMEHL] Tue Feb 21 16:20:43 2023: reconstructintS.
[SEGEMEHL] Tue Feb 21 16:20:43 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:20:43 2023: reconstruct: iteration over 544808941 elems.
[SEGEMEHL] Tue Feb 21 16:20:45 2023: reconstruct: scan B (size: 244462871).
[SEGEMEHL] Tue Feb 21 16:20:49 2023: reconstruct: exit.
init bit array of 68101118
[SEGEMEHL] Tue Feb 21 16:21:59 2023: reconstructintL.
[SEGEMEHL] Tue Feb 21 16:21:59 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:21:59 2023: reconstruct: iteration over 1376024063 elems.
[SEGEMEHL] Tue Feb 21 16:22:05 2023: reconstruct: scan B (size: 544808941).
[SEGEMEHL] Tue Feb 21 16:22:14 2023: reconstruct: exit.
init bit array of 172003008
[SEGEMEHL] Tue Feb 21 16:25:26 2023: reconstructcharS.
[SEGEMEHL] Tue Feb 21 16:25:26 2023: reconstruct: init.
[SEGEMEHL] Tue Feb 21 16:25:26 2023: reconstruct: iteration over 3088286426 elems.
[SEGEMEHL] Tue Feb 21 16:25:39 2023: reconstruct: scan B (size: 1376024063).
[SEGEMEHL] Tue Feb 21 16:26:18 2023: reconstruct: exit.
[SEGEMEHL] Tue Feb 21 16:26:19 2023: enter aluruSuffixArrayS start.
init bit array of 386035804
[SEGEMEHL] Tue Feb 21 16:31:51 2023: constructing inv_suftab (3088286426).
[SEGEMEHL] Tue Feb 21 16:33:32 2023: inv_suftab constructed.
[SEGEMEHL] Tue Feb 21 16:33:32 2023: constructing lcp.
[SEGEMEHL] Tue Feb 21 16:41:02 2023: deleting inv_suftab
[SEGEMEHL] Tue Feb 21 16:41:04 2023: constructing child tab.
[SEGEMEHL] Tue Feb 21 16:47:31 2023: constructing suffix links.
[SEGEMEHL] Tue Feb 21 16:47:31 2023: constructing id.
[SEGEMEHL] Tue Feb 21 16:59:30 2023: constructing suflinks - bottom up.
[SEGEMEHL] Tue Feb 21 17:10:14 2023: constructing suflinks - top down.
suflink construction. pushes: 2134815982, maxstack: 34203004
[SEGEMEHL] Tue Feb 21 17:31:38 2023: building  the suffix array has taken 5859.000000 seconds.
[SEGEMEHL] Tue Feb 21 17:31:38 2023: total length of suffix array was 3088286401.
[SEGEMEHL] Tue Feb 21 17:31:38 2023: writing suffix array '/data3/wsx/eccsplorer_Results2/reference_data/genome.idx' to disk.
[SEGEMEHL] Tue Feb 21 17:35:10 2023: Mapping stats:
        total   mapped  (%)     unique  (%)     multi   (%)     split   (%)
all     0       0       -nan%   0       -nan%   0       -nan%   0       -nan%
[SEGEMEHL] Tue Feb 21 17:35:10 2023:
Goodbye.
 "Ich hol' jetzt die Hilti!" (Ein verzweifelter Bauarbeiter)
2023-02-21 17:35:11,044 - [mapper_coordinator] INFO: Chromosome size file found.
2023-02-21 17:35:11,045 - [mapper_coordinator] INFO: Creating Reference genome sequence window file.
2023-02-21 17:36:29,530 - [mapper_coordinator] INFO: Start: Map reads against reference genome sequence.

2023-02-22 03:48:34,959 - [run_mapping] INFO: [SEGEMEHL] Tue Feb 21 17:36:29 2023: reading queries in '/data3/wsx/eccsplorer_Results2/read_data/SNU16-HMW.R1.fastq.fa'.
[SEGEMEHL] Tue Feb 21 17:39:30 2023: 80878327 query sequences found.
[SEGEMEHL] Tue Feb 21 17:39:30 2023: reading mates in '/data3/wsx/eccsplorer_Results2/read_data/SNU16-HMW.R2.fastq.fa'.
[SEGEMEHL] Tue Feb 21 17:42:29 2023: 80878327 mate sequences found.
[SEGEMEHL] Tue Feb 21 17:42:29 2023: reading database sequences.
[SEGEMEHL] Tue Feb 21 17:43:02 2023: 25 database sequences found.
[SEGEMEHL] Tue Feb 21 17:43:02 2023: total length of db sequences: 3088286401
[SEGEMEHL] Tue Feb 21 17:43:02 2023: assigning all reads to default read group 'A1'.
[SEGEMEHL] Tue Feb 21 17:43:02 2023: additional read group default values '     SM:sample1      LB:library1     PU:unit1        PL:illumina'
[SEGEMEHL] Tue Feb 21 17:43:02 2023: reads assigned to read group 'A1'
[SEGEMEHL] Tue Feb 21 17:43:02 2023: compiled sam header.
[SEGEMEHL] Tue Feb 21 17:43:02 2023: reading suffix array '/data3/wsx/eccsplorer_Results2/reference_data/genome.idx' from disk.
[SEGEMEHL] Tue Feb 21 17:44:06 2023: reading lcpc/vtab.
[SEGEMEHL] Tue Feb 21 17:44:27 2023: reading childtab.
[SEGEMEHL] Tue Feb 21 17:45:25 2023: reading suflinks.
[SEGEMEHL] Tue Feb 21 17:46:38 2023: reading lsint id.
[SEGEMEHL] Tue Feb 21 17:46:43 2023: read suffix array '/data3/wsx/eccsplorer_Results2/reference_data/genome.idx' with 3088286426 elements.
[SEGEMEHL] Tue Feb 21 17:46:48 2023: md5 keys of index and db match.
[SEGEMEHL] Tue Feb 21 17:46:48 2023: reading the suffix array has taken 226.000000 seconds.
[SEGEMEHL] Tue Feb 21 17:46:48 2023: opening sam file '/data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.sam'.
[SEGEMEHL] Tue Feb 21 17:46:48 2023: writing multi splits to '/data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.mult.bed'
[SEGEMEHL] Tue Feb 21 17:46:48 2023: writing sngle splits to '/data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.sngl.bed'
[SEGEMEHL] Tue Feb 21 17:46:48 2023: writing trans splits to '/data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.trns.txt'
[SEGEMEHL] Tue Feb 21 17:46:48 2023: starting 4 threads.
[SEGEMEHL] Wed Feb 22 03:48:28 2023: threaded matching w/ suffixarray has taken 36100.000000 seconds.
[SEGEMEHL] Wed Feb 22 03:48:34 2023: closing output file '/data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.sam'.
[SEGEMEHL] Wed Feb 22 03:48:34 2023: Mapping stats:
        total   mapped  (%)     unique  (%)     multi   (%)     split   (%)
all     161756654       158567819       98.03%  153807210       95.09%  4760609 2.94%   3792391 2.34%
pair    80878327        77362935        95.65%  75029379        92.77%  2333556 2.89%   185420  0.23%
[SEGEMEHL] Wed Feb 22 03:48:34 2023:
Goodbye.
 "Die Ficker!" (Thommy)
2023-02-22 03:48:34,962 - [run_mapping] INFO: Checking split read file.
2023-02-22 03:48:35,794 - [run_mapping] INFO: Converting alignment from SAM to BED.
[bam_sort_core] merging from 104 files and 4 in-memory blocks...
2023-02-22 04:07:17,875 - [run_mapping] INFO: Gathering discordant mapping reads.
2023-02-22 04:07:17,875 - [run_mapping] INFO: Searching for reads not mapped in proper pair.
[bam_sort_core] merging from 8 files and 4 in-memory blocks...
2023-02-22 04:09:44,714 - [run_mapping] INFO: Searching for reads mapped in unusual orientation (rev-for, 83).
[bam_sort_core] merging from 20 files and 4 in-memory blocks...
2023-02-22 04:13:14,469 - [run_mapping] INFO: Searching for reads mapped in unusual orientation (rev-for, 163).
[bam_sort_core] merging from 20 files and 4 in-memory blocks...
2023-02-22 04:16:49,156 - [run_mapping] INFO: Removing duplicates from discordant mapping reads.
2023-02-22 04:46:18,119 - [run_mapping] INFO: Finished gathering discordant mapping reads.
2023-02-22 04:46:18,120 - [run_mapping] INFO: Collecting statistics from alignment file (SAM).
2023-02-22 04:54:05,501 - [run_mapping] INFO: Summarizing mapping.
2023-02-22 04:54:05,502 - [run_mapping] INFO: Finished mapping: TR
2023-02-22 04:54:05,503 - [mapper_coordinator] INFO: Finished: Mapped reads against reference genome sequence!
2023-02-22 04:54:05,503 - [mapper_coordinator] INFO: Start: Summarize mapped split reads (SR) and calculate SR regions.
2023-02-22 04:54:05,503 - [run_splitread_detect] INFO: Calculating regions from split reads.
[SEGEMEHL] Wed Feb 22 04:54:05 2023: reading 1 files.
storing trackname "SingleSplit:A1"
[SEGEMEHL] Wed Feb 22 04:54:08 2023: sorting 879704 items.
[SEGEMEHL] Wed Feb 22 04:54:09 2023: summarizing 879704 splits.
2023-02-22 04:54:09,902 - [run_splitread_detect] INFO: Merging and cleaning up regions.
2023-02-22 04:54:10,091 - [mapper_coordinator] INFO: Finished: Summarized mapped SR and calculated SR regions.
2023-02-22 04:54:10,092 - [mapper_coordinator] INFO: Start: Summarize discordant mapped reads (DR) and calculate DR regions.
2023-02-22 04:54:10,092 - [run_discordantread_detect] INFO: Calculating genome coverage from discordant mapping reads.
2023-02-22 05:03:08,990 - [run_discordantread_detect] INFO: Merging and cleaning up regions.
2023-02-22 05:04:55,171 - [mapper_coordinator] INFO: Start: Summarized mapped DR and calculated DR regions.
2023-02-22 05:06:24,199 - [mapper_coordinator] INFO: Calculating rough coverage and find peaks.
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 66639; MultiID 03)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 66646; MultiID 00)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 66637; MultiID 02)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 66638; MultiID 01)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 66638)
No peaks in TR_map-all.COVERAGE (ProcessID 66638)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 66646)
No peaks in TR_map-all.COVERAGE (ProcessID 66646)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 66639)
No peaks in TR_map-all.COVERAGE (ProcessID 66639)
Calculating peaks in TR_map-all.COVERAGE (ProcessID 66637)
Traceback (most recent call last):
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 815, in <module>
    main()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/ECCsplorer/lib/eccMapper.py", line 750, in mapper_coordinator
    self.win_coverage = np.concatenate((self.win_coverage, self.sum_coverages_tr), axis=1)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 30882878 and the array at index 1 has size 7644028
2023-02-22 06:16:23,480 - [r_shutdown] INFO: Shutting down Rserve.
2023-02-22 06:16:23,481 - [exit_err] ERROR: Sorry, something went wrong.
crimBubble commented 1 year ago

Hi @ShixiangWang,

I am not sure whether this is a bug in the pipeline or whether bedtools behaves differently on CentOS than on Ubuntu. Could you please run the following command with the eccsplorer conda environment active:

bedtools coverage -mean -a /data3/wsx/eccsplorer_Results2/reference_data/x00win100-part.bed -b /data3/wsx/eccsplorer_Results2/eccpipe_results/mapping_results/TR.bed | head -n 50

This should return something like:

chr1    0   100 1.0200000
chr1    100 200 3.6199999
chr1    200 300 3.9900000

If this is the case and the bedtools output looks good, please add

print(win_cov_bed[:10])

to line 64 of the eccMapper.py file and rerun the pipeline with the same command as before. Please post the command-line output here.

liu-zhiyang commented 1 year ago

Hi @crimBubble, the same problem occurred when I ran ECCsplorer. I followed your instructions and tested bedtools coverage first; the output looks fine. For TR.bed:

bedtools coverage -mean -a reference_data/x00win100-part.bed -b eccpipe_results/mapping_results/TR.bed | head -n 50
chr1    0       100     0.0000000
chr1    100     200     0.0000000
chr1    200     300     0.0000000
chr1    300     400     0.0000000
chr1    400     500     0.0000000
chr1    500     600     0.0000000
chr1    600     700     0.0000000
chr1    700     800     0.0000000
chr1    800     900     0.0000000
chr1    900     1000    0.0000000
......

And for CO.bed:

bedtools coverage -mean -a reference_data/x00win100-part.bed -b eccpipe_results/mapping_results/CO.bed | head -n 50
chr1    0       100     0.0000000
chr1    100     200     0.0000000
chr1    200     300     0.0000000
chr1    300     400     0.0000000
chr1    400     500     0.0000000
chr1    500     600     0.0000000
chr1    600     700     0.0000000
chr1    700     800     0.0000000
chr1    800     900     0.0000000
chr1    900     1000    0.0000000
......

I also printed win_coverage, sum_coverage_tr and sum_coverage_co after line 745 of the eccMapper.py file. The output was:

[['chr' 'start' 'end']
 ['chr1' '0' '100']
 ['chr1' '100' '200']
 ...
 ['chrM' '16300' '16400']
 ['chrM' '16400' '16500']
 ['chrM' '16500' '16569']]

[['TR_map-all']
 ['2476.0']
 ['2476.0']
 ...
 ['388.0']
 ['388.0']
 ['388.0']]

[['CO_map-all']
 ['0.0']
 ['0.0']
 ...
 ['38.6300011']
 ['36.7099991']
 ['36.0']]

I found that the first two rows of sum_coverage_tr differ from the bedtools coverage output. It seems some rows are missing?

I have now added print(win_cov_bed[:10]) at line 64 of the eccMapper.py file and am rerunning the pipeline. I will post the output once the run completes.

liu-zhiyang commented 1 year ago

@crimBubble, after adding print(win_cov_bed[:10]) at line 64 of the eccMapper.py file and rerunning the pipeline, the output is:

2023-03-02 11:33:15,491 - [mapper_coordinator] INFO: Calculating rough coverage and find peaks.
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15619; MultiID 13)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15618; MultiID 14)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15617; MultiID 12)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15615; MultiID 11)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15616; MultiID 06)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15620; MultiID 05)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15621; MultiID 03)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15623; MultiID 08)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15622; MultiID 00)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15624; MultiID 09)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15626; MultiID 07)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15625; MultiID 04)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15627; MultiID 02)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15628; MultiID 01)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15630; MultiID 15)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 15629; MultiID 10)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15622)
No peaks in TR_map-all.COVERAGE (ProcessID 15622)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15621)
No peaks in TR_map-all.COVERAGE (ProcessID 15621)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15623)
No peaks in TR_map-all.COVERAGE (ProcessID 15623)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15619)
No peaks in TR_map-all.COVERAGE (ProcessID 15619)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15618)
No peaks in TR_map-all.COVERAGE (ProcessID 15618)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15620)
No peaks in TR_map-all.COVERAGE (ProcessID 15620)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15629)
No peaks in TR_map-all.COVERAGE (ProcessID 15629)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15615)
No peaks in TR_map-all.COVERAGE (ProcessID 15615)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15630)
No peaks in TR_map-all.COVERAGE (ProcessID 15630)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15616)
No peaks in TR_map-all.COVERAGE (ProcessID 15616)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15626)
No peaks in TR_map-all.COVERAGE (ProcessID 15626)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15624)
No peaks in TR_map-all.COVERAGE (ProcessID 15624)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15627)
No peaks in TR_map-all.COVERAGE (ProcessID 15627)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15625)
No peaks in TR_map-all.COVERAGE (ProcessID 15625)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15628)
No peaks in TR_map-all.COVERAGE (ProcessID 15628)
[['chr' 'start' 'end' 'TR_map-all']
 ['chr15' '10673200' '10673300' '0.0000000']
 ['chr15' '10673300' '10673400' '0.0000000']
 ['chr15' '10673400' '10673500' '0.0000000']
 ['chr15' '10673500' '10673600' '0.0000000']
 ['chr15' '10673600' '10673700' '0.0000000']
 ['chr15' '10673700' '10673800' '0.0000000']
 ['chr15' '10673800' '10673900' '0.0000000']
 ['chr15' '10673900' '10674000' '0.0000000']
 ['chr15' '10674000' '10674100' '0.0000000']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 15617)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15650; MultiID 06)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15652; MultiID 12)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15651; MultiID 11)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15653; MultiID 14)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15654; MultiID 13)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15655; MultiID 05)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15656; MultiID 03)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15657; MultiID 00)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15658; MultiID 08)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15659; MultiID 09)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15661; MultiID 07)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15660; MultiID 04)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15662; MultiID 02)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15663; MultiID 01)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15664; MultiID 10)
Calculating mean coverage of CO_map-all.MAPPING (ProcessID 15665; MultiID 15)
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr13' '40320200' '40320300' '34.3699989']
 ['chr13' '40320300' '40320400' '33.2599983']
 ['chr13' '40320400' '40320500' '33.7900009']
 ['chr13' '40320500' '40320600' '37.8800011']
 ['chr13' '40320600' '40320700' '33.1800003']
 ['chr13' '40320700' '40320800' '31.2800007']
 ['chr13' '40320800' '40320900' '31.0000000']
 ['chr13' '40320900' '40321000' '31.0000000']
 ['chr13' '40321000' '40321100' '31.0000000']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr19' '41637300' '41637400' '36.6199989']
 ['chr19' '41637400' '41637500' '38.0000000']
 ['chr19' '41637500' '41637600' '36.3899994']
 ['chr19' '41637600' '41637700' '36.9700012']
 ['chr19' '41637700' '41637800' '37.7500000']
 ['chr19' '41637800' '41637900' '36.2799988']
 ['chr19' '41637900' '41638000' '36.3499985']
 ['chr19' '41638000' '41638100' '37.0000000']
 ['chr19' '41638100' '41638200' '36.1500015']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr15' '10673200' '10673300' '0.0000000']
 ['chr15' '10673300' '10673400' '0.0000000']
 ['chr15' '10673400' '10673500' '0.0000000']
 ['chr15' '10673500' '10673600' '0.0000000']
 ['chr15' '10673600' '10673700' '0.0000000']
 ['chr15' '10673700' '10673800' '0.0000000']
 ['chr15' '10673800' '10673900' '0.0000000']
 ['chr15' '10673900' '10674000' '0.0000000']
 ['chr15' '10674000' '10674100' '0.0000000']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr5' '83600400' '83600500' '217.0000000']
 ['chr5' '83600500' '83600600' '216.0500031']
 ['chr5' '83600600' '83600700' '215.4199982']
 ['chr5' '83600700' '83600800' '216.6399994']
 ['chr5' '83600800' '83600900' '215.0700073']
 ['chr5' '83600900' '83601000' '215.0000000']
 ['chr5' '83601000' '83601100' '214.0899963']
 ['chr5' '83601100' '83601200' '214.0000000']
 ['chr5' '83601200' '83601300' '214.0000000']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr7' '118397100' '118397200' '63.0000000']
 ['chr7' '118397200' '118397300' '63.0000000']
 ['chr7' '118397300' '118397400' '63.0699997']
 ['chr7' '118397400' '118397500' '64.0400009']
 ['chr7' '118397500' '118397600' '65.0500031']
 ['chr7' '118397600' '118397700' '64.9700012']
 ['chr7' '118397700' '118397800' '65.4700012']
 ['chr7' '118397800' '118397900' '68.2600021']
 ['chr7' '118397900' '118398000' '66.9899979']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr6' '95965800' '95965900' '104.2600021']
 ['chr6' '95965900' '95966000' '104.0000000']
 ['chr6' '95966000' '95966100' '104.4400024']
 ['chr6' '95966100' '95966200' '105.0000000']
 ['chr6' '95966200' '95966300' '104.0599976']
 ['chr6' '95966300' '95966400' '104.0000000']
 ['chr6' '95966400' '95966500' '104.8199997']
 ['chr6' '95966500' '95966600' '104.6699982']
 ['chr6' '95966600' '95966700' '104.5400009']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr4' '80665700' '80665800' '125.0000000']
 ['chr4' '80665800' '80665900' '124.4899979']
 ['chr4' '80665900' '80666000' '125.9899979']
 ['chr4' '80666000' '80666100' '123.4800034']
 ['chr4' '80666100' '80666200' '123.0000000']
 ['chr4' '80666200' '80666300' '123.0000000']
 ['chr4' '80666300' '80666400' '123.0000000']
 ['chr4' '80666400' '80666500' '123.0000000']
 ['chr4' '80666500' '80666600' '123.0000000']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr1' '192914600' '192914700' '162.3999939']
 ['chr1' '192914700' '192914800' '161.4400024']
 ['chr1' '192914800' '192914900' '161.3399963']
 ['chr1' '192914900' '192915000' '163.0000000']
 ['chr1' '192915000' '192915100' '163.4400024']
 ['chr1' '192915100' '192915200' '165.2799988']
 ['chr1' '192915200' '192915300' '165.0000000']
 ['chr1' '192915300' '192915400' '162.3000031']
 ['chr1' '192915400' '192915500' '162.3099976']]
[['chr' 'start' 'end' 'CO_map-all']
 ['chr1' '0' '100' '0.0000000']
 ['chr1' '100' '200' '0.0000000']
 ['chr1' '200' '300' '0.0000000']
 ['chr1' '300' '400' '0.0000000']
 ['chr1' '400' '500' '0.0000000']
 ['chr1' '500' '600' '0.0000000']
 ['chr1' '600' '700' '0.0000000']
 ['chr1' '700' '800' '0.0000000']
 ['chr1' '800' '900' '0.0000000']]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/liuzhy/anaconda3/envs/eccsplorer/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/liuzhy/anaconda3/envs/eccsplorer/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/liuzhy/software/ECCsplorer/lib/eccMapper.py", line 58, in get_rough_coverage
    win_cov_bed = subp_bedtools_to_arr(region, mapping)
  File "/home/liuzhy/software/ECCsplorer/lib/eccMapper.py", line 39, in subp_bedtools_to_arr
    win_cov_bed = win_cov_bed.reshape((win_cov_bed.shape[0] // 4), 4)  # list to table
ValueError: cannot reshape array of size 954609 into shape (238652,4)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "ECCsplorer.py", line 815, in <module>
    main()
  File "ECCsplorer.py", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/home/liuzhy/software/ECCsplorer/lib/eccMapper.py", line 727, in mapper_coordinator
    tuples_coverages = pool.map(func, window_file_parts)  # part info, array with coverage info
  File "/home/liuzhy/anaconda3/envs/eccsplorer/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/liuzhy/anaconda3/envs/eccsplorer/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: cannot reshape array of size 954609 into shape (238652,4)
2023-03-02 11:57:47,780 - [r_shutdown] INFO: Shutting down Rserve.
2023-03-02 11:57:47,824 - [exit_err] ERROR: Sorry, something went wrong.
crimBubble commented 1 year ago

Hi @liu-zhiyang, it looks like the problem only occurs with your control data: the TR part does not raise an error, but the same function raises one for the CO part. I am really not sure why, as it works with one of your data sets and not with the other. Could you try running the -m map mode with just your circSeq data?

liu-zhiyang commented 1 year ago

@crimBubble, I ran the pipeline in -m map mode with just the circSeq data. It raised the same error. The output of the rough coverage calculation is:

2023-03-03 15:11:04,907 - [mapper_coordinator] INFO: Calculating rough coverage and find peaks.
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16614; MultiID 08)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16615; MultiID 09)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16608; MultiID 12)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16610; MultiID 13)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16607; MultiID 11)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16606; MultiID 06)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16613; MultiID 00)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16609; MultiID 14)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16616; MultiID 04)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16611; MultiID 05)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16612; MultiID 03)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16617; MultiID 07)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16618; MultiID 02)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16619; MultiID 01)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16620; MultiID 10)
Calculating mean coverage of TR_map-all.MAPPING (ProcessID 16621; MultiID 15)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16614)
No peaks in TR_map-all.COVERAGE (ProcessID 16614)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16613)
No peaks in TR_map-all.COVERAGE (ProcessID 16613)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16617)
No peaks in TR_map-all.COVERAGE (ProcessID 16617)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16608)
No peaks in TR_map-all.COVERAGE (ProcessID 16608)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16621)
No peaks in TR_map-all.COVERAGE (ProcessID 16621)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16615)
No peaks in TR_map-all.COVERAGE (ProcessID 16615)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16610)
No peaks in TR_map-all.COVERAGE (ProcessID 16610)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16619)
No peaks in TR_map-all.COVERAGE (ProcessID 16619)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16611)
No peaks in TR_map-all.COVERAGE (ProcessID 16611)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16609)
No peaks in TR_map-all.COVERAGE (ProcessID 16609)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16616)
No peaks in TR_map-all.COVERAGE (ProcessID 16616)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16618)
No peaks in TR_map-all.COVERAGE (ProcessID 16618)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16606)
No peaks in TR_map-all.COVERAGE (ProcessID 16606)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16607)
No peaks in TR_map-all.COVERAGE (ProcessID 16607)
[['chr' 'start' 'end' 'TR_map-all']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16612)
No peaks in TR_map-all.COVERAGE (ProcessID 16612)
[['chr' 'start' 'end' 'TR_map-all']
 ['chr11' '119231400' '119231500' '1016.0000000']
 ['chr11' '119231500' '119231600' '1014.9899902']
 ['chr11' '119231600' '119231700' '1014.7199707']
 ['chr11' '119231700' '119231800' '1014.0000000']
 ['chr11' '119231800' '119231900' '1014.0000000']
 ['chr11' '119231900' '119232000' '1014.0000000']
 ['chr11' '119232000' '119232100' '1014.0000000']
 ['chr11' '119232100' '119232200' '1014.0599976']
 ['chr11' '119232200' '119232300' '1014.2000122']]
Calculating peaks in TR_map-all.COVERAGE (ProcessID 16620)
Traceback (most recent call last):
  File "ECCsplorer.py", line 815, in <module>
    main()
  File "ECCsplorer.py", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/home/liuzhy/software/ECCsplorer/lib/eccMapper.py", line 748, in mapper_coordinator
    self.win_coverage = np.concatenate((self.win_coverage, self.sum_coverages_tr), axis=1)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 30882878 and the array at index 1 has size 1894510
2023-03-03 15:27:02,435 - [r_shutdown] INFO: Shutting down Rserve.
2023-03-03 15:27:02,486 - [exit_err] ERROR: Sorry, something went wrong.

It seems that for the processes reporting no peaks, the results, which should contain one chr start end 0 line per window, were empty and are missing from the final result (1894510 seems to be the number of windows per process).
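
To illustrate the mismatch, here is a toy sketch with made-up data (not ECCsplorer's actual code; append_coverage_column is a hypothetical helper). It shows how np.concatenate with axis=1 fails as soon as the coverage column has fewer rows than the window table, and how a row-count check beforehand would at least give a more readable message:

```python
import numpy as np


def append_coverage_column(windows, coverage):
    """Append a coverage column to the window table, failing early with a readable message."""
    if windows.shape[0] != coverage.shape[0]:
        raise ValueError(
            f"window table has {windows.shape[0]} rows but coverage column has "
            f"{coverage.shape[0]} rows; some parts probably returned no coverage"
        )
    return np.concatenate((windows, coverage), axis=1)


# Toy window table: header plus four windows (hypothetical values).
windows = np.array([["chr", "start", "end"],
                    ["chr1", "0", "100"],
                    ["chr1", "100", "200"],
                    ["chr1", "200", "300"],
                    ["chr1", "300", "400"]])

# Complete coverage column: header plus one value per window.
complete = np.array([["TR_map-all"], ["0.0"], ["0.0"], ["2476.0"], ["2476.0"]])
# Partial coverage column: only one of two parts delivered its rows.
partial = np.array([["TR_map-all"], ["2476.0"], ["2476.0"]])

merged = append_coverage_column(windows, complete)

try:
    append_coverage_column(windows, partial)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(merged.shape)
print(mismatch_message)
```

With the partial column the helper stops before np.concatenate is reached, which is the situation I suspect in my run, only with 30882878 versus 1894510 rows.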

ShixiangWang commented 1 year ago

Same observations with tumor circle-seq paired sequencing data.

crimBubble commented 1 year ago

Hi @liu-zhiyang and @ShixiangWang, the peak calculation is not what causes the problem here. Finding no peaks is expected behavior, and the peaks are dumped directly to a file. The problem rather seems to be that bedtools is not reporting any coverage information at all. In the output you can see that coverage is reported only for the sub-process with ProcessID 16620, and the peaks are calculated from it. The other sub-processes show no bedtools coverage output whatsoever.

Since you are both working with human data and its relatively large genome, this might be a memory-related issue when many worker processes run at once. Try changing line 695 in the lib/eccMapper.py file

from pool = Pool(config.CPU) to pool = Pool(1)

to reduce the number of worker processes used for this step.
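
A possible way to make this kind of failure visible earlier, sketched under assumptions: the second traceback in this thread tried to reshape 954609 values into 4 columns, and 954609 is not divisible by 4, which would be consistent with the bedtools output being cut off mid-line. The helper below (coverage_text_to_table is a hypothetical name, not part of ECCsplorer) checks the captured text before reshaping it:

```python
import numpy as np


def coverage_text_to_table(text, expected_rows=None):
    """Turn whitespace-separated `chr start end mean` text into an (n, 4) string array.

    Raises ValueError with a descriptive message if the text is empty,
    truncated, or has an unexpected number of rows.
    """
    fields = text.split()
    if not fields:
        raise ValueError("no coverage output received")
    if len(fields) % 4 != 0:
        raise ValueError(
            f"{len(fields)} fields is not a multiple of 4; output looks truncated"
        )
    table = np.array(fields, dtype=str).reshape(-1, 4)
    if expected_rows is not None and table.shape[0] != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {table.shape[0]}")
    return table


# Hypothetical captured output: two complete lines.
good = "chr1\t0\t100\t1.0200000\nchr1\t100\t200\t3.6199999\n"
# The same output cut off in the middle of the second line.
truncated = "chr1\t0\t100\t1.0200000\nchr1\t100\t200"

table = coverage_text_to_table(good, expected_rows=2)
print(table.shape)

for label, text in (("truncated", truncated), ("empty", "")):
    try:
        coverage_text_to_table(text)
    except ValueError as err:
        print(label, "->", err)
```

This does not fix the underlying cause, it only turns a silent gap or a cryptic reshape error into a message that names the affected part.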

ShixiangWang commented 1 year ago

@crimBubble I tried this and got the following message:

Calculating peaks in TR_map-all.COVERAGE (ProcessID 131529)
2023-03-09 00:56:30,771 - [mapper_coordinator] INFO: Coverage calculation and peak finding took 23687.08s.
2023-03-09 00:56:30,807 - [mapper_coordinator] INFO: Extracting eccDNA candidate regions.
2023-03-09 00:58:07,090 - [mapper_coordinator] INFO: Extracting eccDNA candidate regions took 96.27s.
2023-03-09 00:58:07,092 - [mapper_coordinator] INFO: Normalizing coverage data and editing Rscript for visualization.
Traceback (most recent call last):
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 815, in <module>
    main()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/eccsplorer", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/data3/wsx/miniconda3/envs/eccsplorer/bin/ECCsplorer/lib/eccMapper.py", line 787, in mapper_coordinator
    self.conn.r.convgenome(self.csv_sum_raw, self.maps_basecnt, self.csv_sum_nrm)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rconn.py", line 369, in __call__
    return self._rconn.callFunc(self.__name__, *args, **kw)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rconn.py", line 78, in decoCheckIfClosed
    return func(self, *args, **kw)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rconn.py", line 261, in callFunc
    self.setRexp(argName, arg)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rconn.py", line 78, in decoCheckIfClosed
    return func(self, *args, **kw)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rconn.py", line 222, in setRexp
    rAssign(name, o, self.sock)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rserializer.py", line 409, in rAssign
    s.serialize(o, dtTypeCode=rtypes.DT_SEXP)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rserializer.py", line 144, in serialize
    length = self.serializeExpr(o)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rserializer.py", line 168, in serializeExpr
    s_func(self, o)
  File "/data3/wsx/miniconda3/envs/eccsplorer/lib/python3.7/site-packages/pyRserve/rserializer.py", line 335, in s_xt_array_numeric
    raise ValueError('Cannot serialize long integer arrays with '
ValueError: Cannot serialize long integer arrays with values outside MAX_INT32 (2**31-1) range
2023-03-09 00:58:08,222 - [r_shutdown] INFO: Shutting down Rserve.
2023-03-09 00:58:08,222 - [exit_err] ERROR: Sorry, something went wrong.
ShixiangWang commented 1 year ago
$ tree ~/eccsplorer_Results/eccpipe_results/mapping_results/
/data3/wsx/eccsplorer_Results/eccpipe_results/mapping_results/
├── candidates
├── CO_regions-all.bed
├── temp
│   ├── report_hiconf_win.tmp
│   ├── TR_map-all_x00win100-part.tmp
│   ├── TR_map-all_x01win100-part.tmp
│   ├── TR_map-all_x02win100-part.tmp
│   └── TR_map-all_x03win100-part.tmp
├── TR_aligned-DR.bed
├── TR_aligned-DR-F163.bed
├── TR_aligned-DR-F83.bed
├── TR_aligned-DR_graph.bed
├── TR_aligned-DR-nF2.bed
├── TR_aligned-SR.bed
├── TR_alignment-stats.txt
├── TR.bed
├── TR_haarz-SR.bed
├── TR_hiconf-ECC-REGIONS.bed
├── TR_lowconf-ECC-regions.bed
├── TR_lowconf-ECC-regions_DR-all.bed
├── TR_lowconf-ECC-regions_SR-all.bed
├── TR_lowconf-ECC-regions_SR-DR.bed
├── TR.mult.bed
├── TR_regions-all.bed
├── TR_regions-DR.bed
├── TR_regions-SR.bed
├── TR.sam
├── TR.sngl.bed
├── TR_summary_coverages.csv
├── TR.trns.txt
└── TR_unmapped.sam
crimBubble commented 1 year ago

This error comes from the pyRserve Python package. Currently it can only pass integers in the int32 range from Python to R, which means that the numbers in your coverage data are too large to pass. Unfortunately, there is nothing I can change about this.
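
For reference, the limit named in the error message is 2**31 - 1 = 2147483647. A small sketch of how one might check values before they are handed to R follows; the function name and the example numbers are hypothetical, only the limit itself is taken from the pyRserve error message:

```python
import numpy as np

INT32_MIN = np.iinfo(np.int32).min
INT32_MAX = np.iinfo(np.int32).max  # 2**31 - 1


def values_exceeding_int32(values):
    """Return the entries that cannot be stored as a signed 32-bit integer."""
    arr = np.asarray(values, dtype=np.int64)
    return arr[(arr > INT32_MAX) | (arr < INT32_MIN)]


# Hypothetical counts: the first fits into int32, the second does not.
example_counts = [1_500_000_000, 24_000_000_000]
too_large = values_exceeding_int32(example_counts)
print(INT32_MAX, too_large.tolist())
```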

My only advice here is to use less input data. I would not recommend using more than 1x genome coverage as input to the pipeline. With larger inputs, the probability of detecting false positives such as repetitive DNA increases markedly.
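
As a rough guide, the number of read pairs that corresponds to 1x coverage can be estimated from the genome size and the read length. The sketch below uses the genome size and pair count from the segemehl log earlier in this thread; the read length of 150 bp is an assumption (it is not stated anywhere here), and pairs_for_coverage is a hypothetical helper, not an ECCsplorer function:

```python
def pairs_for_coverage(genome_size, read_length, coverage=1.0):
    """Number of read pairs whose combined length gives the requested genome coverage."""
    return int(genome_size * coverage // (2 * read_length))


GENOME_SIZE = 3_088_286_401   # "total length of db sequences" in the segemehl log
AVAILABLE_PAIRS = 80_878_327  # "query sequences found" in the segemehl log
READ_LENGTH = 150             # assumption: not stated in this thread

target_pairs = pairs_for_coverage(GENOME_SIZE, READ_LENGTH)
fraction = min(1.0, target_pairs / AVAILABLE_PAIRS)
print(f"keep about {target_pairs} of {AVAILABLE_PAIRS} pairs (fraction {fraction:.3f})")
```

With a different read length the target changes proportionally, so please plug in the real value for your library.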

However, in your output you can see the files TR_hiconf-ECC-REGIONS.bed and TR_lowconf-ECC-regions.bed. Those contain the positional information of your eccDNA candidates and represent the main results of the mapping part of the pipeline.

ShixiangWang commented 1 year ago

@crimBubble Thanks.

ShixiangWang commented 1 year ago

@crimBubble I have a follow-up question: is it possible to use ECCsplorer to get the circular DNA structure?

crimBubble commented 1 year ago

From the mapping you currently only get the linear sequence. The clustering module gives you a better idea of the circular structure through the clustering graphs, at least for some candidates. I am currently trying different assemblers, such as SPAdes, to get a better structural view of the eccDNAs, but this is still quite far from being implemented in the pipeline.

ShixiangWang commented 1 year ago

@crimBubble How can I run the clustering module with tumor-only Circle-Seq data?

According to your README, two paired-end data sets are required:

python ECCsplorer.py readsA1.fa/q readsA2.fa/q readsB1.fa/q readsB2.fa/q
crimBubble commented 1 year ago

Well, for the control data you should ideally use data from the exact same sample, but without linear DNA removal and eccDNA amplification prior to sequencing, to get the most reliable results. It is possible to use any WGS data as control data, but be cautious when doing this, as you might get false-positive results from structural variations or other individual differences.

ShixiangWang commented 1 year ago

Thanks.