baoe / AlignGraph

Algorithm for secondary de novo genome assembly guided by closely related references

AlignGraph running indefinitely #9

Open ramesh8v opened 9 years ago

ramesh8v commented 9 years ago

Hi, AlignGraph has been running on my computer for more than two days, but both the extended and remaining contig files are still empty. I can see a lot of files in the tmp folder. What could the problem be? There is no issue with Bowtie2 and BLAT; both are working like a charm.

I appreciate any kind of help!

Thanks, Ramesh

baoe commented 9 years ago

Hi, Ramesh,

You may check whether there are outputs like steps (1), (2), (3)... of AlignGraph. If yes, AlignGraph is processing the reference chromosomes one by one and you can follow the progress; if not, AlignGraph is still doing the alignment and you will have to wait longer.

During the alignment stage, many contigs are compared with many reference chromosomes, so it can be time-consuming. Besides specifying the -fastMap option, another suggestion is to combine the reference sequences if the reference genome consists of long contigs/scaffolds rather than complete chromosomes.

Currently, AlignGraph's runtime performance may still not be satisfactory, and it may take some more time to release an accelerated version...

Best, Bao
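The suggestion above to combine reference sequences could be sketched as follows. This is a rough Python illustration, not part of AlignGraph: the function names, the output record name, and the run of N characters placed between scaffolds are all assumptions (Bao only says to combine the sequences, without specifying how).

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_scaffolds(lines, name="combined_reference", spacer_len=100, width=60):
    """Join all records into a single FASTA record.

    ASSUMPTION: scaffolds are separated by a run of N characters so the
    boundaries stay visible; spacer_len=100 is an arbitrary choice.
    """
    spacer = "N" * spacer_len
    combined = spacer.join(seq for _, seq in read_fasta(lines))
    out = [">" + name]
    out.extend(combined[i:i + width] for i in range(0, len(combined), width))
    return "\n".join(out) + "\n"


# Small in-memory demonstration with two made-up scaffolds.
demo = [">scaffold_1", "ACGT", "AC", ">scaffold_2", "GG"]
print(combine_scaffolds(demo, name="demo_ref", spacer_len=3), end="")
```

For a real file, the same function can be given an open file handle, for example `combine_scaffolds(open("scaffolds.fa"))`, where the file name is a placeholder. Note that positions in the combined record no longer match the original scaffold coordinates.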



baoe commented 9 years ago

Hi,

It's possible to have an empty extend_contig file, but in that case you would get a remaining_contig file containing all the initial contigs. If both are empty, something is wrong. Could you check the bowtie_doc.txt and blat_doc.txt files to make sure the alignments ran correctly?

Best, Bao


From: bisnow33 [notifications@github.com] Sent: Friday, June 05, 2015 12:42 AM To: baoe/AlignGraph Cc: Bao Subject: Re: [AlignGraph] AlignGraph running indefinitely (#9)

Hi, I ran AlignGraph after ABySS on the whole genome of Arabidopsis thaliana. As a result I got a big tmp folder (about 300 GB, with 7 .fa files named by chromosome number) but nothing in the extend_contig and remaining_contig files. Does that mean no contigs were extended?


bisnow33 commented 9 years ago

Hi, thank you for your answer. In fact, my job wasn't finished; now it's OK and I got my two files! I will analyse my results, but I don't think it will solve my problem (I am trying to construct a reference in repeat sequences...). Best,

Tristan

baoe commented 9 years ago

Great. Good luck!

Bao



bisnow33 commented 9 years ago

Just one last little question about the output. I obtained this kind of result:

AlignGraph0 @ AtChr1 CHROMOSOME dumped from ADB: Feb/3/09 16:9; last updated: 2007-12-20 : 799038 38781 2221398 595597+,19044-,110612+ ; 804606 81080 4670079 539284+,99N,799238- ; 806284 60171 3679382 796899-,99N,803245+ ;

I did not really understand the meaning of the numbers. How can I extend my reference sequences? (I mean, merge the results with the references.)

Tristan

baoe commented 9 years ago

Hi, Tristan,

You may check section 5 of the GitHub website for the output format. In your case, the following contigs were combined:

799038 38781 2221398 595597+,19044-,110612+ ; 804606 81080 4670079 539284+,99N,799238- ; 806284 60171 3679382 796899-,99N,803245+ ;

with this reference chromosome:

AtChr1 CHROMOSOME dumped from ADB: Feb/3/09 16:9; last updated: 2007-12-20

Best, Bao



bisnow33 commented 9 years ago

Hi, I already checked it (before asking ^^), but what I was not sure about is how the line is constructed. Does it mean that, for the first line, 3 contigs named 799038, 38781 and 2221398 were merged together, and that the other numbers give the position and +/- the strand?

Another question: I got another output file named ex.fa, but I did not find any information about it. What does it contain?

Sorry for all these questions, but I am new to assembly... and it's not trivial for me.

Best,

Tristan

baoe commented 9 years ago

(1) 799038 38781 2221398 595597+,19044-,110612+ would be one complete contig name, since the output format is: AlignGraphX @ chromosomeID : contig/scaffoldID ; contig/scaffoldID ; contig/scaffoldID ... : partY, where chromosomeID is the specification of the reference chromosome used to generate the extended contig or scaffold, X is a number starting from 0 to identify the extended contig or scaffold for each reference chromosome, and contig/scaffoldIDs are the specifications of the extendable contigs or scaffolds.

(2) Don't worry about the ex.fa file. It's just something I generated to test AlignGraph. I will disable it in future versions.

Best, Bao
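The format described in (1) can be split mechanically. Below is a minimal Python sketch, not part of AlignGraph; the function name `parse_aligngraph_header` is hypothetical. It assumes that field separators always have a space on both sides (" : " and " ; "), which is what lets it skip the colons and semicolons inside the chromosome ID in Tristan's example. That assumption is inferred from the single example in this thread, not from AlignGraph's documentation.

```python
import re


def parse_aligngraph_header(header):
    """Split an AlignGraph output name into the fields described above."""
    header = header.strip().lstrip(">").strip()
    match = re.match(r"AlignGraph(\d+) @ (.+)$", header)
    if match is None:
        raise ValueError("not an AlignGraph name: %r" % header)
    # ASSUMPTION: top-level fields are separated by " : " (spaces on both sides).
    fields = match.group(2).split(" : ")
    contig_field = fields[1] if len(fields) > 1 else ""
    # ASSUMPTION: contig/scaffold IDs are separated by " ; ", with an
    # optional trailing ";" as in the example from this thread.
    contig_ids = [c.strip() for c in contig_field.rstrip("; ").split(" ; ") if c.strip()]
    return {
        "index": int(match.group(1)),       # the X in AlignGraphX
        "chromosome_id": fields[0].strip(),
        "contig_ids": contig_ids,
        "part": fields[2].strip() if len(fields) > 2 else None,  # partY, if present
    }


example = ("AlignGraph0 @ AtChr1 CHROMOSOME dumped from ADB: Feb/3/09 16:9; "
           "last updated: 2007-12-20 : 799038 38781 2221398 595597+,19044-,110612+ ; "
           "804606 81080 4670079 539284+,99N,799238- ; "
           "806284 60171 3679382 796899-,99N,803245+ ;")
parsed = parse_aligngraph_header(example)
print(parsed["chromosome_id"])
for contig_id in parsed["contig_ids"]:
    print(contig_id)
```

Each printed contig ID is one complete name from the original assembly, such as `799038 38781 2221398 595597+,19044-,110612+`; the sketch does not try to interpret the numbers inside it.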



sowmyakothapalli commented 9 years ago

Hi, I am new to this field. I am supposed to build a reference assembly for the rufous-collared sparrow genome using AlignGraph. I am not getting any error, but the remaining and extendedContig files are not created at all, i.e. I don't get any output. Can anyone help me?

baoe commented 9 years ago

Hi,

You could refer to FAQ 5 for the answer: use the -fastMap option and check the outputs to follow the progress of AlignGraph.

Best, Bao



sowmyakothapalli commented 9 years ago

Hi Bao, this is my input. I suspect that the contig and genome files I gave are also paired-end (PE) DNA read files. After adding --fastMap it runs forever and does not give any output. Can you suggest a solution?

AlignGraph --read1 Capensis-3_GAATTCGT-TATAGCCT_AC6GPLANXX_L001_001.R1.fasta --read2 Capensis-3_GAATTCGT-TATAGCCT_AC6GPLANXX_L001_001.R2.fasta --contig Capensis-3_GAATTCGT-TATAGCCT_AC6GPLANXX_L002_001.R1.fasta --genome Capensis-3_GAATTCGT-TATAGCCT_AC6GPLANXX_L002_001.R2.fasta --distanceLow 100 --distanceHigh 1500 --extendedContig Sparrow1.fasta --remainingContig Sparrow2.fasta --fastMap

Thanks, Sowmya
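In the command above, the --contig and --genome arguments are the L002 R1 and R2 files, which by their names look like a second lane of paired-end reads, as Sowmya suspects, rather than an assembly and a reference genome. One rough way to check what a FASTA file contains is to summarize its record lengths. The sketch below is an illustration only; the function names are hypothetical, and deciding whether the numbers look like reads or contigs is left to the reader.

```python
def fasta_lengths(lines):
    """Return the sequence length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(lines):
    """Summarize record count and length statistics for one FASTA input."""
    lengths = fasta_lengths(lines)
    if not lengths:
        return {"records": 0, "total_bases": 0, "longest": 0, "mean": 0.0}
    total = sum(lengths)
    return {
        "records": len(lengths),
        "total_bases": total,
        "longest": max(lengths),
        "mean": total / len(lengths),
    }


# In-memory demonstration with two made-up records.
demo = [">r1", "ACGT", ">r2", "AC", "GT", "A"]
print(summarize(demo))
```

Running `summarize(open(path))` on each of the four input files would show whether the --contig and --genome files have the same short, uniform record lengths as the read files; `path` here is a placeholder for the actual file name.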

baoe commented 8 years ago

Hi,

The current version of AlignGraph supports NUCMER to avoid BLAT running indefinitely. I hope the new version works for you.

Best, Bao