simoncchu / REPdenovo

A tool to construct repeats directly from raw reads
MIT License

TERefiner_1 error #11

Open Milad021 opened 6 years ago

Milad021 commented 6 years ago

Hi Chong,

I'm from France, and I think I have a problem with TERefiner_1. Is this step important for the rest of my analysis? I obtain my contigs.fa, but after scaffolding there is no information in my X_contig_pairs_info.txt_cov_info_with_cutoff.txt.

Running command: /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa -c 0.9 -g ...
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
Running command: /usr/local/src/REPdenovo/ContigsMerger -s 0.2 -i1 -6.0 -i2 -6.0 -x 15 -y 50 -k 10 -t 15 -m 1 -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merge.info ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa > ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa ...
Running command: Running command: samtools faidx ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa ...
[bwa_index] Pack FASTA... 0.12 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 5.07 seconds elapse.
[bwa_index] Update BWT... 0.09 sec
[bwa_index] Pack forward-only FASTA... 0.07 sec
[bwa_index] Construct SA from BWT and Occ... 2.19 sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa index ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa
[main] Real time: 8.361 sec; CPU: 7.540 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 4034 sequences (10001826 bp)...
[M::process] read 5595 sequences (6324459 bp)...
[M::mem_process_seqs] Processed 4034 reads in 71.228 CPU sec, 71.386 real sec
[M::mem_process_seqs] Processed 5595 reads in 46.156 CPU sec, 46.256 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -a ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa
[main] Real time: 117.819 sec; CPU: 117.456 sec
Running command: /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa -c 0.85 -g ...
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.itself.bam': No such file or directory
Running command: Running command: samtools faidx ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa ...
[bwa_index] Pack FASTA... 0.08 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 2.97 seconds elapse.
[bwa_index] Update BWT... 0.06 sec
[bwa_index] Pack forward-only FASTA... 0.05 sec
[bwa_index] Construct SA from BWT and Occ... 1.39 sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa index ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa
[main] Real time: 5.104 sec; CPU: 4.552 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 4162 sequences (10002529 bp)...
[M::process] read 1173 sequences (877888 bp)...
[M::mem_process_seqs] Processed 4162 reads in 64.264 CPU sec, 64.418 real sec
[M::mem_process_seqs] Processed 1173 reads in 7.060 CPU sec, 7.065 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -a ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa
[main] Real time: 71.635 sec; CPU: 71.368 sec
Running command: /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa.no_contained.fa -c 0.85 ...
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.bam': No such file or directory

This is quite urgent because I need to finish my internship. Thank you so much!

Milad021 commented 6 years ago

And I have this in my directory:

30mer.temp_contigs.fa  Asm_30_4880_29   Asm_40_64434_39   Asm_60_112338_59  contigs.fa.itself.sort.bam  contigs.fa_no_dup.fa.merge.info
40mer.temp_contigs.fa  Asm_30_610_29    Asm_40_8054_39    Asm_60_14042_59   contigs.fa.itself.sort.bam.bai  dumped_30mers.txt
50mer.temp_contigs.fa  Asm_30_78089_29  Asm_50_119523_49  Asm_60_1755_59    contigs.fa_no_dup.fa  dumped_40mers.txt
60mer.temp_contigs.fa  Asm_30_9761_29   Asm_50_14940_49   Asm_60_28084_59   contigs.fa_no_dup.fa.merged.fa  dumped_50mers.txt
Asm_30_1220_29    Asm_40_1006_39    Asm_50_1867_49   Asm_60_3510_59   contigs.fa_no_dup.fa.merged.fa.fai  dumped_60mers.txt
Asm_30_156178_29  Asm_40_128868_39  Asm_50_29880_49  Asm_60_438_59    contigs.fa_no_dup.fa.merged.fa.itself.sort.bam  kmers_fq.fastq
Asm_30_19522_29   Asm_40_16108_39   Asm_50_3735_49   Asm_60_56169_59  contigs.fa_no_dup.fa.merged.fa.itself.sort.bam.bai  original_contigs_before_merging.fa
Asm_30_2440_29    Asm_40_2013_39    Asm_50_466_49    Asm_60_7021_59   contigs.fa_no_dup.fa.merged.fa.no_dup.fa  reads_coverage.txt
Asm_30_305_29     Asm_40_32217_39   Asm_50_59761_49  Asm_60_877_59    contigs.fa_no_dup.fa.merged.fa.no_dup.fa.fai
Asm_30_312356_29  Asm_40_4027_39    Asm_50_7470_49   contigs.fa       contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.sort.bam
Asm_30_39044_29   Asm_40_503_39     Asm_50_933_49    contigs.fa.fai   contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.sort.bam.bai

simoncchu commented 6 years ago

Hi @Milad021, yes, it is important. Have you tried recompiling TERefiner_1? Also, check whether the input files of TERefiner are empty.
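The emptiness check suggested above can be sketched in shell as follows. This is only an illustration: check_nonempty is a hypothetical helper name, not part of REPdenovo, and the paths in the usage comment are the ones that appear in the log earlier in this thread.

```shell
# check_nonempty FILE...
# Hypothetical helper: reports every file that is missing or has zero size.
# Returns 0 only if all given files exist and are non-empty.
check_nonempty() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage (paths taken from the log above; adjust to your output folder):
#   check_nonempty ./Siek14_Repeatelm_12X/contigs.fa \
#                  ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam
```

A non-zero size does not prove that a BAM file is readable. If your samtools version provides the quickcheck subcommand, running samtools quickcheck on the BAM file is a stricter test.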

Milad021 commented 6 years ago

I tried to recompile TERefiner_1, but I got an error: the compiler cannot find api/BamReader.h. I tried several links from the bamtools repository and its build folder to REPdenovo/bamtools-master, but the file is never found.

When I run find ./ -name lib from REPdenovo/bamtools, I get nothing, although make for TERefiner asks for it.
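A quick way to see where a bamtools build actually placed its header and library, as a minimal sketch. locate_bamtools is a hypothetical helper; BamReader.h is the header named in the error above, while the library file names libbamtools.a and libbamtools.so are an assumption about what the bamtools build produces.

```shell
# locate_bamtools ROOT
# Hypothetical helper: prints the paths of the bamtools API header and of any
# bamtools library found below ROOT, so they can be used for -I and -L.
locate_bamtools() {
    root=$1
    # Header that TERefiner includes as api/BamReader.h
    find "$root" -name 'BamReader.h' -print
    # Static or shared library (file names are an assumption)
    find "$root" \( -name 'libbamtools.a' -o -name 'libbamtools.so*' \) -print
}

# Usage: locate_bamtools ../bamtools
```

The include path passed with -I should be the directory that contains the api folder, not the api folder itself, since the source includes the header as api/BamReader.h.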

Thanks for your time

simoncchu commented 6 years ago

I think you need to set the bamtools path in the makefile (under the TERefiner folder). Usually the shipped binary works well and doesn't need to be recompiled. But I do need to release a user-friendly installation package.

Milad021 commented 6 years ago

In fact, there are now (maybe only recently) two separate include folders for bamtools. I made a small modification to your Makefile, after which it compiled, with a lot of warnings. Here is the change:


BAMTOOLS=../bamtools/build/src/api/include
BAMTOOLSSHARED=../bamtools/build/src/include

CFLAGS =  -O3 -Wall -static -I$(BAMTOOLS) -I$(BAMTOOLSSHARED) -L$(BAMTOOLS_LD) -Wl,-rpath,$(BAMTOOLS_LD)

It now compiles without errors, but I'm not sure everything is right.

Milad021 commented 6 years ago

Unfortunately, the problem is still the same, and the same files are generated in my directory:
"""""""""""""
[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.11 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.06 sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa index ./Siek14_Repeatelm_12X/contigs.fa
[main] Real time: 0.707 sec; CPU: 0.184 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2624 sequences (671503 bp)...
[M::mem_process_seqs] Processed 2624 reads in 1.304 CPU sec, 1.306 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -a ./Siek14_Repeatelm_12X/contigs.fa ./Siek14_Repeatelm_12X/contigs.fa
[main] Real time: 1.437 sec; CPU: 1.312 sec
Running command: /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa -c 0.9 -g ...
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
rm: cannot remove './Siek14_Repeatelm_12X/contigs.fa.itself.bam': No such file or directory
"""""""""""""""

My directory:

""""""""""""""""
contigs.fa
contigs.fa.fai
contigs.fa.itself.sort.bam
contigs.fa.itself.sort.bam.bai
contigs.fa_no_dup.fa
contigs.fa_no_dup.fa.merged.fa
contigs.fa_no_dup.fa.merged.fa.fai
contigs.fa_no_dup.fa.merged.fa.itself.sort.bam
contigs.fa_no_dup.fa.merged.fa.itself.sort.bam.bai
contigs.fa_no_dup.fa.merged.fa.no_dup.fa
contigs.fa_no_dup.fa.merged.fa.no_dup.fa.fai
contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.sort.bam
contigs.fa_no_dup.fa.merged.fa.no_dup.fa.itself.sort.bam.bai
contigs.fa_no_dup.fa.merge.info
dumped_30mers.txt
dumped_40mers.txt
dumped_50mers.txt
dumped_60mers.txt
kmers_fq.fastq
original_contigs_before_merging.fa
reads_coverage.txt
"""""""""""""""""""

My config.txt:

""""""""
MIN_REPEAT_FREQ 100
RANGE_ASM_FREQ_DEC 2
RANGE_ASM_FREQ_GAP 0.8
K_MIN 30
K_MAX 60
K_INC 10
K_DFT 30
READ_LENGTH 151
GENOME_LENGTH 650000000
MIN_CONTIG_LENGTH 100
ASM_NODE_LENGTH_OFFSET -1
IS_DUPLICATE_REPEATS 0.85
COV_DIFF_CUTOFF 0.5
MIN_SUPPORT_PAIRS 20
MIN_FULLY_MAP_RATIO 0.2
TR_SIMILARITY 0.85
TREADS 15
BWA_PATH GLOBAL
SAMTOOLS_PATH GLOBAL
JELLYFISH_PATH GLOBAL
VELVET_PATH /usr/bin
VELVET_PATH /home/dygap/mbehzadi/src/velvet
REFINER_PATH /usr/local/src/REPdenovo/TERefiner_1
CONTIGS_MERGER_PATH /usr/local/src/REPdenovo/ContigsMerger
OUTPUT_FOLDER ./Siek14_Repeatelm_12X/
VERBOSE 1
""""""""

grep '>' original_contigs_before_merging.fa | wc -l
2624

Thank you

simoncchu commented 6 years ago

Can you run this command directly and post the errors it reports?

/usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa -c 0.9 -g

Milad021 commented 6 years ago

Thank you for the speedy response! I will do it tomorrow, as soon as I have access to my PC.

simoncchu commented 6 years ago

In case you run into other issues, you can compile ContigsMerger on your server in the same way. Then copy the newly compiled TERefiner_1 and ContigsMerger to the same directory as main.py and try again.
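The copy step can be sketched like this. install_next_to_main is a hypothetical helper, and the source paths in the usage comment are assumptions about where the builds leave their binaries; only /usr/local/src/REPdenovo and the two binary names come from this thread.

```shell
# install_next_to_main DEST BINARY...
# Hypothetical helper: copies freshly built binaries into DEST, but only if
# DEST really is the directory that holds main.py.
install_next_to_main() {
    dest=$1
    shift
    if [ ! -f "$dest/main.py" ]; then
        echo "main.py not found in $dest" >&2
        return 1
    fi
    for bin in "$@"; do
        cp "$bin" "$dest/" || return 1
        chmod +x "$dest/$(basename "$bin")" || return 1
    done
}

# Usage (source locations are assumed; check where make writes the binaries):
#   install_next_to_main /usr/local/src/REPdenovo \
#       ./TERefiner/TERefiner_1 ./ContigsMerger/ContigsMerger
```

Keeping a backup of the original binaries before overwriting them makes it easy to go back if the rebuilt ones behave differently.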

Milad021 commented 6 years ago

Hi @Reedwarbler, running this command gives me the following error message:
""""""""""""
$ /usr/local/src/REPdenovo/TERefiner_1 /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa -c 0.9 -g
Index file not found, now create it!!!
Index file cannot be created!!!
Bamtools ERROR: could not open input BAM file: ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam
Cannot parse bam file
""""""""""""

simoncchu commented 6 years ago

I think the error is caused before this step. Is ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam empty? Also, can you compile ContigsMerger on your server in the same way, copy the newly compiled TERefiner_1 and ContigsMerger to the same directory as main.py, and then try again?

Milad021 commented 6 years ago

Hi @Reedwarbler, I'm sorry for my late response; I will not be in the lab until Tuesday. I'll let you know as soon as I'm there. If I cannot get TERefiner and ContigsMerger to work, can I send you my original_contigs_before_merging.fa? Also, will I need TERefiner and ContigsMerger for the scaffolding step?

Milad021 commented 6 years ago

Hi, the file doesn't look empty:


678K mai   31 17:30 contigs.fa.itself.sort.bam

the head is :


BAMsl@HD  VN:1.3  SO:coordinate
@SQ SN:NODE_1_length_176_cov_1.988636_30_31235_312356   LN:204
@SQ SN:NODE_2_length_106_cov_2.000000_30_31235_312356   LN:134
@SQ SN:NODE_6_length_76_cov_1.973684_30_31235_312356    LN:104
@SQ SN:NODE_7_length_72_cov_1.972222_30_31235_312356    LN:100

ContigsMerger compiled on the machine without errors. I'm not sure whether the error comes from the change I made to the makefile for TERefiner. If not, I have no idea where it comes from.

simoncchu commented 6 years ago

Can you view this file?

samtools view contigs.fa.itself.sort.bam NODE_1_length_176_cov_1.988636_30_31235_312356

Milad021 commented 6 years ago

NODE_1_length_176_cov_1.988636_30_31235_312356  256 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   204M    *   0   0   *   *   NM:i:0  MD:Z:204    AS:i:204
NODE_2_length_236_cov_2.258475_30_15617_312356  256 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   204M60H *   0   0   *   *   NM:i:1  MD:Z:8A195  AS:i:199
NODE_3_length_161_cov_2.869565_30_7808_312356   256 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   42H147M *   0   0   *   *   NM:i:1  MD:Z:43G103 AS:i:142
NODE_8_length_242_cov_1.995868_40_12886_128868  256 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   17H204M59H  *   0   0   *   *   NM:i:0  MD:Z:204    AS:i:204
NODE_12_length_229_cov_1.995633_50_11952_119523 272 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   14H204M59H  *   0   0   *   *   NM:i:0  MD:Z:204    AS:i:204
NODE_83_length_174_cov_2.045977_50_3040_119523  272 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   62H160M *   0   0   *   *   NM:i:0  MD:Z:160    AS:i:160
NODE_83_length_174_cov_2.045977_50_3040_74700   272 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   62H160M *   0   0   *   *   NM:i:0  MD:Z:160    AS:i:160
NODE_82_length_106_cov_2.028302_50_3040_37350   272 NODE_1_length_176_cov_1.988636_30_31235_312356  1   0   62H92M  *   0   0   *   *   NM:i:0  MD:Z:92 AS:i:92
NODE_3_length_142_cov_1.992958_60_11233_112338  272 NODE_1_length_176_cov_1.988636_30_31235_312356  4   0   200M    *   0   0   *   *   NM:i:0  MD:Z:200    AS:i:200
NODE_365_length_167_cov_4.095809_30_3904_195220 272 NODE_1_length_176_cov_1.988636_30_31235_312356  10  0   195M    *   0   0   *   *   NM:i:3  MD:Z:34G33G105G20   AS:i:180
NODE_535_length_160_cov_4.800000_30_3040_97610  272 NODE_1_length_176_cov_1.988636_30_31235_312356  13  0   17H171M *   0   0   *   *   NM:i:4  MD:Z:31G31T1G69G35  AS:i:151
NODE_12_length_167_cov_2.029940_60_3040_112338  272 NODE_1_length_176_cov_1.988636_30_31235_312356  20  0   185M40H *   0   0   *   *   NM:i:2  MD:Z:24G33G126  AS:i:175
NODE_12_length_167_cov_2.029940_60_3040_70210   272 NODE_1_length_176_cov_1.988636_30_31235_312356  20  0   185M40H *   0   0   *   *   NM:i:2  MD:Z:24G33G126  AS:i:175
NODE_12_length_167_cov_2.029940_60_3040_35105   272 NODE_1_length_176_cov_1.988636_30_31235_312356  20  0   185M40H *   0   0   *   *   NM:i:2  MD:Z:24G33G126  AS:i:175
NODE_148_length_88_cov_4.352273_40_3221_128868  272 NODE_1_length_176_cov_1.988636_30_31235_312356  79  0   126M    *   0   0   *   *   NM:i:1  MD:Z:82A43  AS:i:121
NODE_204_length_88_cov_4.613636_40_3040_80540   272 NODE_1_length_176_cov_1.988636_30_31235_312356  79  0   126M    *   0   0   *   *   NM:i:2  MD:Z:11T70A43   AS:i:116

simoncchu commented 6 years ago

I mean this file: ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam, because you said:

By running this command, I have this error message :
""""""""""""
$ /usr/local/src/REPdenovo/TERefiner_1 /usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa -c 0.9 -g
Index file not found, now create it!!!
Index file cannot be created!!!
Bamtools ERROR: could not open input BAM file: ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam
Cannot parse bam file
""""""""""""

Milad021 commented 6 years ago

Sorry, I didn't understand. I gave you the head of the file ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam, and then I showed you the output of the command you asked me to run:


samtools view contigs.fa.itself.sort.bam NODE_1_length_176_cov_1.988636_30_31235_312356

Thanks for responding so quickly each time!

I reran all the REPdenovo commands, and now I get a strange error:


/usr/local/src/REPdenovo/TERefiner_1 -P -b ./Siek14_Repeatelm_12X/contigs.fa.itself.sort.bam -r ./Siek14_Repeatelm_12X/contigs.fa -o ./Siek14_Repeatelm_12X/contigs.fa -c 0.9 -g
*** Error in `/usr/local/src/REPdenovo/TERefiner_1': free(): invalid next size (fast): 0x0000000001b97fd0 ***
======= Backtrace: =========
[0x54b087]
[0x550912]
[0x55112e]
[0x4215c0]
[0x423e9d]
[0x4027b5]
[0x5336b3]
[0x53393e]
[0x40699a]
======= Memory map: ========
00400000-00652000 r-xp 00000000 fe:04 921790                             /usr/local/src/REPdenovo/TERefiner_1
00851000-0085d000 rw-p 00251000 fe:04 921790                             /usr/local/src/REPdenovo/TERefiner_1
0085d000-00864000 rw-p 00000000 00:00 0 
01b35000-027e1000 rw-p 00000000 00:00 0                                  [heap]
7fab14000000-7fab1402c000 rw-p 00000000 00:00 0 
7fab1402c000-7fab18000000 ---p 00000000 00:00 0 
7fab18df3000-7fab18ecc000 rw-p 00000000 00:00 0 
7fab18fe4000-7fab1900e000 rw-p 00000000 00:00 0 
7ffe8ab88000-7ffe8aba9000 rw-p 00000000 00:00 0                          [stack]
7ffe8abd9000-7ffe8abdb000 r--p 00000000 00:00 0                          [vvar]
7ffe8abdb000-7ffe8abdd000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted

I have plenty of free RAM.
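One detail of the command above: its -o argument is ./Siek14_Repeatelm_12X/contigs.fa, the same path as -r, whereas the earlier runs in this thread wrote to contigs.fa_no_dup.fa. Whether this is related to the crash is not established here. A small guard against it can be sketched as follows, where same_file is a hypothetical helper and the -ef test is assumed to be supported by the shell in use (it is in bash and dash):

```shell
# same_file A B
# Hypothetical helper: succeeds if A and B are the same path string or
# refer to the same existing file on disk.
same_file() {
    [ "$1" = "$2" ] && return 0
    [ "$1" -ef "$2" ]
}

# Usage before launching TERefiner_1 by hand:
#   ref=./Siek14_Repeatelm_12X/contigs.fa
#   out=./Siek14_Repeatelm_12X/contigs.fa_no_dup.fa
#   if same_file "$ref" "$out"; then
#       echo "output would overwrite the reference: $out" >&2
#   fi
```

If contigs.fa was overwritten by such a run, regenerating it before retrying avoids feeding a modified reference to the next step.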

Milad021 commented 6 years ago

Hi @Reedwarbler, do you have any idea?