songweizhi / MarkerMAG

Linking MAGs with 16S rRNA marker genes
GNU Affero General Public License v3.0

samtools sort error: No such file or directory #2

Closed metalichen closed 2 years ago

metalichen commented 2 years ago

Hello,

Thanks for your tool! I installed version 1.1.5 today and ran into a problem when running the link module:

>MarkerMAG link -p tmp_SRR13125477 -marker ../03_metagenome_reanalysis/assembly_SRR13125477.bacteria.fasta -mag selected_mags/SRR13125477/ -x fa -r1 SRR13125477_R1.fasta -r2 SRR13125477_R2.fasta -t 12 -o tmp_SRR13125477 -no_polish

[2022-02-02 06:30:25] parameters for linking
 + mismatch:    2%
 + min_M_len:   45bp
 + min_M_pct:   35%
 + min_link_num_gnm:    9
 + min_link_num_ctg:    3
 + rd2_end_seq_len:     1000bp
 + max_short_cigar_pct: 75,85
[2022-02-02 06:30:25] parameters for estimating copy number
 + MAG_cov_subsample_pct:       25%
 + min_insert_size_16s: -1000bp
 + ignore_ends_len_16s: 150bp
 + ignore_lowest_pct:   25%
 + ignore_highest_pct:  25%
 + both_pair_mapped:    False
[2022-02-02 06:30:29] Rd1: quality control provided 16S rRNA gene sequences to:  
[2022-02-02 06:30:29] Rd1: remove sequences shorter than 1200 bp
[2022-02-02 06:30:29] Rd1: cluster at 99% identity and keep only the longest one in each cluster
[2022-02-02 06:30:29] Rd1: qualified 16S rRNA gene sequences exported to:
[2022-02-02 06:30:29] assembly_SRR13125477.bacteria_unpolished_min1200bp_c99.fasta
[2022-02-02 06:30:30] Rd1: mapping input reads to marker genes
[2022-02-02 06:30:30] Rd1: sorting mappping results
Traceback (most recent call last):
  File "/data/tagirdzh/miniconda3/bin/MarkerMAG", line 133, in <module>
    link_16s.link_16s(args, config_dict)
  File "/data/tagirdzh/miniconda3/lib/python3.8/site-packages/MarkerMAG/link_16s.py", line 2765, in link_16s
    os.remove(input_reads_to_16s_sam_bowtie)
FileNotFoundError: [Errno 2] No such file or directory: 'tmp_SRR13125477/tmp_SRR13125477_rd1_wd/tmp_SRR13125477_input_reads_to_16S.sam'
[E::hts_open_format] Failed to open file tmp_SRR13125477/tmp_SRR13125477_rd1_wd/tmp_SRR13125477_input_reads_to_16S.sam
samtools sort: can't open "tmp_SRR13125477/tmp_SRR13125477_rd1_wd/tmp_SRR13125477_input_reads_to_16S.sam": No such file or directory

The .sam file named in the error does exist, however. I also noticed that after the program had exited with the error, bowtie2-align kept running in the background for some time.

I'm using samtools v1.7 and bowtie2-align-s version 2.3.5.1.

Am I doing something wrong?

Thanks, Gulnara

songweizhi commented 2 years ago

Hi Gulnara, thanks for using MarkerMAG. Can you try the demo dataset and see whether the error still occurs? Weizhi

https://drive.google.com/drive/folders/1edzpj6QV6jRQ24F1wT_9pIDzOIV_b3ki?usp=sharing

aistBMRG commented 2 years ago

I am running into a similar issue.

I was able to resolve it by modifying the source code in link_16s.py slightly (removing the if-else statement starting at line 2747, "if reads_vs_16s_sam is None:", but keeping the mapping and sorting commands). With that change, mapping completed before sorting started, and the expected files (the sam file and the sorted sam file) were generated.

However, the sorted sam file then cannot be found by the next command (os.system('wc -l %s > %s' % (input_reads_to_16s_sam_sorted, input_reads_to_16s_sam_sorted_line_num))). Running that step manually seems to work. I am not sure what is going on; could the developers provide some input?

Regards,

Dieter

songweizhi commented 2 years ago

Thanks for reporting the issue. This is strange. I have now removed the if-else statement, though I have no idea why it matters. Can you please upgrade MarkerMAG to 1.1.19 and see whether that bypasses the error?

aistBMRG commented 2 years ago

Thanks. I tried the newer version but again ran into the issue the original poster described: sorting appears to start before mapping has completed (so I guess the if-else statement was not the cause after all). As a result, the required sam file is not found, although it appears to be present. FYI, I am using Python 3.9.12 on Ubuntu 16.04.7 LTS.

Dieter.

songweizhi commented 2 years ago

I use "&> /dev/null" to hide the STDOUT and STDERR output from bowtie and samtools; I am not sure whether this causes the error. Can you please remove it from the bowtie_read_to_16s_cmd and sort_by_read_cmd lines and let me know if the error still exists? Thanks, Weizhi
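
A likely explanation, consistent with the symptoms reported above: os.system passes its command string to /bin/sh, and on Ubuntu /bin/sh is dash rather than bash. "&>" is a bash extension. A POSIX shell such as dash parses "cmd &> /dev/null" as "cmd &" (start cmd in the background) followed by a separate "> /dev/null", so the call returns at once while the mapper is still running and the next step cannot find its input yet. The portable shell spelling is "> /dev/null 2>&1". The sketch below shows another option, assuming the external tools are launched from Python: subprocess.run with subprocess.DEVNULL discards both streams without involving a shell and blocks until the command exits. The helper name run_quietly is hypothetical and not part of MarkerMAG, and the Python interpreter stands in for bowtie2 and samtools so that the example is self-contained.

```python
import subprocess
import sys


def run_quietly(cmd_args):
    """Run a command given as a list of arguments, discard its stdout and
    stderr, wait for it to finish, and return its exit code.

    Hypothetical helper for illustration; not taken from MarkerMAG.
    """
    completed = subprocess.run(
        cmd_args,
        stdout=subprocess.DEVNULL,  # replaces the "> /dev/null" part
        stderr=subprocess.DEVNULL,  # replaces the "2>&1" part
        check=False,                # report the exit code instead of raising
    )
    return completed.returncode


# Stand-in commands: the current Python interpreter plays the external tool.
rc_ok = run_quietly([sys.executable, "-c", "print('noisy output')"])
rc_fail = run_quietly([sys.executable, "-c", "import sys; sys.exit(3)"])
print(rc_ok, rc_fail)
```

Because run_quietly only returns after the child process has exited, a following step such as sorting or deleting the sam file cannot start early, and a non-zero return code can be checked before continuing.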

aistBMRG commented 2 years ago

Great.

That fixed the issue, at least for the initial mapping and sorting commands. The "&> /dev/null" may need to be removed from additional lines, since a similar issue occurs later on. This is probably also why I got the mapping to run previously: if I recall correctly, I had removed that part as well.

Thanks a lot for helping me resolve this issue.

Dieter

songweizhi commented 2 years ago

The troublesome "&> /dev/null" has now been removed from all scripts in version 1.1.20. Cheers, Weizhi
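
For anyone on an older install who wants to check whether any such redirects remain, the following is a minimal sketch, not a MarkerMAG utility. It scans the .py files under a directory for the literal string "&> /dev/null". The function name find_bash_only_redirects and the demo file are hypothetical; point the function at the MarkerMAG package directory of your own environment.

```python
import tempfile
from pathlib import Path

PATTERN = "&> /dev/null"


def find_bash_only_redirects(root):
    """Return (path, line_number, stripped_line) for every line in a .py
    file under root that contains PATTERN."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if PATTERN in line:
                hits.append((str(path), line_number, line.strip()))
    return hits


# Self-contained demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_file = Path(demo_dir) / "example_module.py"
    demo_file.write_text(
        "import os\n"
        "os.system('some_tool input.txt &> /dev/null')\n"
        "os.system('some_tool input.txt > /dev/null 2>&1')\n",
        encoding="utf-8",
    )
    demo_hits = find_bash_only_redirects(demo_dir)
    for hit_path, hit_line_number, hit_line in demo_hits:
        print(f"{Path(hit_path).name}:{hit_line_number}: {hit_line}")
```

In the demo only the second line of the file is reported; the third line uses the portable "> /dev/null 2>&1" form and is left alone.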