crimBubble / ECCsplorer

The ECCsplorer is a bioinformatics pipeline for the automated detection of extrachromosomal circular DNA (eccDNA) from paired-end read data of amplified circular DNA.
GNU General Public License v3.0

ValueError: invalid literal for int() with base 10: b'1.37998e+06\n' #11

Closed by liyugo 1 year ago

liyugo commented 1 year ago

Hi @crimBubble

An error occurred when I ran ECCsplorer on paired-end Circle-seq data using the command eccsplorer $circleseq0h_1.fastq.gz $circleseq0h_2.fastq.gz -ref $genome_index --max_threads 10 -out $output_dir --mode map. The error log is as follows.

2023-02-04 02:21:14,185 - [run_discordantread_detect] INFO: Calculating genome coverage from discordant mapping reads.
2023-02-04 02:23:48,674 - [run_discordantread_detect] INFO: Merging and cleaning up regions.
Traceback (most recent call last):
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/eccsplorer", line 815, in <module>
    main()
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/eccsplorer", line 775, in main
    sum_mapper_win_coverage, sum_mapper_candidate_fas, analysis_errors = obj_mapper.mapper_coordinator()
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/ECCsplorer/lib/eccMapper.py", line 647, in mapper_coordinator
    self.run_discordantread_detect()
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/ECCsplorer/lib/eccMapper.py", line 437, in run_discordantread_detect
    min_coverage_allowed = int(int(max_coverage) * config.BACKGROUND_PERC)
ValueError: invalid literal for int() with base 10: b'1.37998e+06\n'
2023-02-04 02:23:52,577 - [r_shutdown] INFO: Shutting down Rserve.
2023-02-04 02:23:52,578 - [exit_err] ERROR: Sorry, something went wrong.

Is there any solution or suggestion to solve this problem?

crimBubble commented 1 year ago

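Hi @liyugo, the error occurs because the coverage in your data is very high: the maximum coverage value is then reported in scientific notation (1.37998e+06), which int() cannot parse. Please find line 437 in the file /home/yuguo/anaconda3/envs/repeatexplorer/bin/ECCsplorer/lib/eccMapper.py and change it from: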

min_coverage_allowed = int(int(max_coverage) * config.BACKGROUND_PERC)

to the following:

min_coverage_allowed = int(float(max_coverage) * config.BACKGROUND_PERC)

You can then re-run the ECCsplorer pipeline with exactly the same command as before; it will re-use the existing SAM mapping files instead of repeating the whole mapping step.
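For readers who want to see why the change helps, here is a minimal sketch of the parsing behaviour. The byte string is copied from the traceback above; the function name min_coverage and the value 0.5 used for the background percentage are assumptions for illustration only, not the actual contents of ECCsplorer's config.

```python
# Raw value as it appears in the traceback: bytes in scientific notation.
raw_max_coverage = b"1.37998e+06\n"

# Hypothetical stand-in for config.BACKGROUND_PERC; the real value may differ.
BACKGROUND_PERC = 0.5


def min_coverage(raw, background_perc):
    """Parse the coverage as a float first, then truncate the product to int."""
    return int(float(raw) * background_perc)


# int() only accepts plain integer literals, so scientific notation fails.
try:
    int(raw_max_coverage)
    int_parse_failed = False
except ValueError:
    int_parse_failed = True

# float() understands the exponent and ignores the trailing newline.
min_coverage_allowed = min_coverage(raw_max_coverage, BACKGROUND_PERC)
print(int_parse_failed, min_coverage_allowed)
```

The outer int() is kept so that the result is still an integer, as in the original line.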

liyugo commented 1 year ago

@crimBubble Thank you very much for your timely reply! I modified the code according to your suggestions, but a new error occurred.

Traceback (most recent call last):
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/eccsplorer", line 35, in <module>
    from lib import eccMapper, eccClusterer, eccComparer
  File "/home/yuguo/anaconda3/envs/repeatexplorer/bin/ECCsplorer/lib/eccMapper.py", line 438
    min_coverage_allowed = int(float(max_coverage)) * config.BACKGROUND_PERC)
                                                                            ^
SyntaxError: invalid syntax

Do you have any other suggestions?

crimBubble commented 1 year ago

Dear @liyugo, please remove the second ")" after max_coverage in the line you modified. It should look exactly like this:

min_coverage_allowed = int(float(max_coverage) * config.BACKGROUND_PERC)
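A quick way to catch this kind of typo before re-running the pipeline is to check that the edited line still parses. The sketch below uses Python's standard ast module on the two variants quoted in this thread; the helper name parses is hypothetical.

```python
import ast


def parses(source):
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# The line with the extra closing bracket, as shown in the SyntaxError above.
broken = "min_coverage_allowed = int(float(max_coverage)) * config.BACKGROUND_PERC)"
# The intended line.
fixed = "min_coverage_allowed = int(float(max_coverage) * config.BACKGROUND_PERC)"

print(parses(broken), parses(fixed))
```

To check the whole edited file rather than a single line, running python -m py_compile on eccMapper.py reports syntax errors without starting the pipeline.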

liyugo commented 1 year ago

Haha, sorry about that. Thank you so much, I really appreciate it.