aidenlab / 3d-dna

3D de novo assembly (3D DNA) pipeline
MIT License

No FINAL* files created. #159

Open Jung19911124 opened 1 year ago

Jung19911124 commented 1 year ago

Hi, I used the conda version of 3D-DNA (version 190716) to scaffold my genome. However, I did not get any FINAL files (FINAL.fasta, FINAL.hic), although no errors were reported in the log file.

I have obtained PREFIX_final.assembly, PREFIX_final.hic, PREFIX_final.fasta, PREFIX_final.asm, PREFIX_final.cprops, PREFIX_HiC.fasta, and PREFIX_HiC.assembly.

Is this result correct? Any advice would be greatly appreciated.

Best, Jung

liaomei1995 commented 1 year ago

Hi,

When I run 'run-asm-pipeline.sh -r 0 xx.fa merged_nodups.txt', I do not get the FINAL files (FINAL.fasta, FINAL.hic). There are no errors in the log file, only these warning messages:

:| WARNING: GNU Parallel version 20150322 or later not installed. We highly recommend to install it to increase performance. Starting pipeline without parallelization!

:| Warning: no explicit bundle size was listed. Will use the same one as listed for false positive size threshold: this is the most typical scenario.

:| WARNING: GNU Parallel version 20150322 or later not installed. We highly recommend to install it to increase performance. Starting pipeline without parallelization!

:| Warning: No input for label1 was provided. Default for label1 is ":::fragment_"

:| Warning: No input for label2 was provided. Default for label2 is ":::debris"

:| Warning: No input for label1 was provided. Default for label1 is ":::fragment_"

:| Warning: No input for label2 was provided. Default for label2 is ":::debris"

Kind regards, Aomei
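The first warning says the pipeline starts without parallelization because it did not find GNU Parallel 20150322 or later. A minimal shell sketch for checking that follows; `parallel_new_enough` and `report_parallel` are names made up for illustration (they are not part of 3D-DNA), and the sketch assumes that the first line of `parallel --version` has the form `GNU parallel <version>`.

```shell
# Hypothetical helper (not part of 3D-DNA): succeed if the given banner line
# reports GNU Parallel 20150322 or later, the version named in the warning.
parallel_new_enough() {
  min_version=20150322
  # Assumed banner format: "GNU parallel <version>"
  found=$(printf '%s\n' "$1" | awk 'NR == 1 && $1 == "GNU" && $2 == "parallel" { print $3 }')
  case $found in
    '' | *[!0-9]*) return 1 ;;  # empty or non-numeric: not a usable version
  esac
  [ "$found" -ge "$min_version" ]
}

# Hypothetical wrapper: inspect whatever `parallel` is on the PATH.
report_parallel() {
  if ! command -v parallel >/dev/null 2>&1; then
    echo "parallel: not found on PATH"
  elif parallel_new_enough "$(parallel --version 2>/dev/null | head -n 1)"; then
    echo "parallel: GNU Parallel is recent enough"
  else
    echo "parallel: too old, or not GNU Parallel"
  fi
}
```

Calling `report_parallel` in the same environment that runs the pipeline (for example the activated conda environment) shows which of the three cases applies.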

dudcha commented 1 year ago

Hi Aomei,

This is not enough for me to diagnose. Please attach the full output including stdout and stderr and the list of files that did get generated. Did the pipeline finish at all or it is still running?

Olga
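One way to collect what is asked for here is to send stdout and stderr to a single log and save the directory listing next to it. This is only a sketch: `run_logged` and the file names `3ddna.log` and `files.txt` are placeholders chosen for illustration, not part of 3D-DNA.

```shell
# Hypothetical helper: run a command with stdout and stderr captured in one log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
  logfile=$1
  shift
  # "> file 2>&1" sends stdout to the file, then points stderr at the same place
  "$@" > "$logfile" 2>&1
}

# Example with the command from this thread (log and listing names are placeholders):
#   run_logged 3ddna.log run-asm-pipeline.sh -r 0 xx.fa merged_nodups.txt
#   echo "exit status: $?"
#   ls -lh > files.txt
```

The function returns the exit status of the command it ran, so a non-zero status after the call is a quick hint that the run stopped early.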


liaomei1995 commented 1 year ago

Dear Sir/Madam, please find the attached picture. Kind regards, Aomei




dudcha commented 1 year ago

Hey Aomei, There appears to be no attachment, sorry. Olga

acboulet commented 1 year ago

Hello Olga. I was hoping to piggyback on this issue thread because I am having the same issue.

I'm trying to run the pipeline using the Human data from here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE95797.

Below is the list of all the files in the output directory; I've also attached the log (human-3ddna-log.txt).

Any advice on how to troubleshoot the issue would be greatly appreciated.

[/testing/3DDNA/human/stdenv]
$ ls -lh
total 25G
-rw-r----- 1 user group   55K Mar 19 04:36 archive.GSE95797_Hs1.edits.at.step.1.txt
-rw-r----- 1 user group  5.2K Mar 19 01:27 coverage_wide.at.step.0.dist.txt
-rw-r----- 1 user group  2.1M Mar 19 01:27 coverage_wide.at.step.0.wig
-rw-r----- 1 user group  5.2K Mar 19 04:36 coverage_wide.at.step.1.dist.txt
-rw-r----- 1 user group  2.1M Mar 19 04:36 coverage_wide.at.step.1.wig
-rw-r----- 1 user group   36M Mar 19 01:27 depletion_score_narrow.at.step.0.wig
-rw-r----- 1 user group   36M Mar 19 04:36 depletion_score_narrow.at.step.1.wig
-rw-r----- 1 user group  2.0M Mar 19 01:25 depletion_score_wide.at.step.0.wig
-rw-r----- 1 user group  2.0M Mar 19 04:34 depletion_score_wide.at.step.1.wig
-rw-r----- 1 user group   65K Mar 19 01:27 edits.for.step.1.txt
-rw-r----- 1 user group   27K Mar 19 04:36 edits.for.step.2.txt
-rw-r----- 1 user group  437K Mar 19 00:12 GSE95797_Hs1.0.asm
-rw-r----- 1 user group   11M Mar 19 00:13 GSE95797_Hs1.0_asm.scaffold_track.txt
-rw-r----- 1 user group  5.0M Mar 19 00:13 GSE95797_Hs1.0_asm.superscaf_track.txt
-rw-r----- 1 user group  2.8M Mar 19 00:13 GSE95797_Hs1.0.assembly
lrwxrwxrwx 1 user group    19 Mar 18 22:36 GSE95797_Hs1.0.cprops -> GSE95797_Hs1.cprops
-rw-r----- 1 user group 1008M Mar 19 01:24 GSE95797_Hs1.0.hic
-rw-r----- 1 user group  441K Mar 19 03:08 GSE95797_Hs1.1.asm
-rw-r----- 1 user group   11M Mar 19 03:09 GSE95797_Hs1.1_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 19 03:09 GSE95797_Hs1.1_asm.superscaf_track.txt
-rw-r----- 1 user group  2.9M Mar 19 03:09 GSE95797_Hs1.1.assembly
-rw-r----- 1 user group  2.4M Mar 19 01:27 GSE95797_Hs1.1.cprops
-rw-r----- 1 user group 1008M Mar 19 04:33 GSE95797_Hs1.1.hic
-rw-r----- 1 user group  442K Mar 19 06:13 GSE95797_Hs1.2.asm
-rw-r----- 1 user group   11M Mar 19 06:14 GSE95797_Hs1.2_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 19 06:14 GSE95797_Hs1.2_asm.superscaf_track.txt
-rw-r----- 1 user group  2.9M Mar 19 06:14 GSE95797_Hs1.2.assembly
-rw-r----- 1 user group  2.4M Mar 19 04:36 GSE95797_Hs1.2.cprops
-rw-r----- 1 user group 1009M Mar 19 07:41 GSE95797_Hs1.2.hic
-rw-r----- 1 user group  2.3M Mar 18 22:36 GSE95797_Hs1.cprops
-rw-r----- 1 user group   76K Mar 19 04:36 GSE95797_Hs1.edits.txt
lrwxrwxrwx 1 user group    25 Mar 19 11:56 GSE95797_Hs1.final.asm -> GSE95797_Hs1.rawchrom.asm
-rw-r----- 1 user group   11M Mar 20 09:30 GSE95797_Hs1.final_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 20 09:30 GSE95797_Hs1.final_asm.superscaf_track.txt
lrwxrwxrwx 1 user group    30 Mar 19 11:56 GSE95797_Hs1.final.assembly -> GSE95797_Hs1.rawchrom.assembly
lrwxrwxrwx 1 user group    28 Mar 19 11:56 GSE95797_Hs1.final.cprops -> GSE95797_Hs1.rawchrom.cprops
lrwxrwxrwx 1 user group    25 Mar 19 11:56 GSE95797_Hs1.final.hic -> GSE95797_Hs1.rawchrom.hic
-rw-r----- 1 user group  6.9G Mar 20 09:30 GSE95797_Hs1.final.mnd.txt
-rw-r----- 1 user group  3.1M Mar 19 14:00 GSE95797_Hs1_HiC.assembly
-rw-r----- 1 user group  2.7G Mar 19 14:00 GSE95797_Hs1_HiC.fasta
-rw-r----- 1 user group  161M Mar 19 12:24 GSE95797_Hs1_HiC.hic
lrwxrwxrwx 1 user group    98 Mar 20 09:29 GSE95797_Hs1.mnd.txt -> /gpfs/gifs_project/platforms/dma/references/tools/3DDNA/test_datasets/Hs2-HiC/GSE95797_Hs1.mnd.txt
-rw-r----- 1 user group  442K Mar 19 07:48 GSE95797_Hs1.polished.asm
-rw-r----- 1 user group   11M Mar 19 07:48 GSE95797_Hs1.polished_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 19 07:48 GSE95797_Hs1.polished_asm.superscaf_track.txt
-rw-r----- 1 user group  2.4M Mar 19 07:44 GSE95797_Hs1.polished.cprops
-rw-r----- 1 user group   38M Mar 19 07:44 GSE95797_Hs1.polished.depletion_score_narrow.wig
-rw-r----- 1 user group  498K Mar 19 07:41 GSE95797_Hs1.polished.depletion_score_wide.wig
-rw-r----- 1 user group  6.2K Mar 19 07:44 GSE95797_Hs1.polished.edits_2D.txt
-rw-r----- 1 user group 1009M Mar 19 09:11 GSE95797_Hs1.polished.hic
-rw-r----- 1 user group  6.8K Mar 19 07:44 GSE95797_Hs1.polished.mismatches_2D.txt
-rw-r----- 1 user group  343K Mar 19 07:44 GSE95797_Hs1.polished.mismatch_narrow.bed
-rw-r----- 1 user group  1.6K Mar 19 07:41 GSE95797_Hs1.polished.mismatch_wide.bed
-rw-r----- 1 user group  2.9M Mar 19 09:15 GSE95797_Hs1.polished.split.assembly
-rw-r----- 1 user group   38M Mar 19 09:14 GSE95797_Hs1.polished.split.depletion_score_narrow.wig
-rw-r----- 1 user group  498K Mar 19 09:11 GSE95797_Hs1.polished.split.depletion_score_wide.wig
-rw-r----- 1 user group  2.0K Mar 19 09:14 GSE95797_Hs1.polished.split.edits_2D.txt
-rw-r----- 1 user group   11K Mar 19 09:14 GSE95797_Hs1.polished.split.mismatches_2D.txt
-rw-r----- 1 user group  343K Mar 19 09:14 GSE95797_Hs1.polished.split.mismatch_narrow.bed
-rw-r----- 1 user group  1.6K Mar 19 09:11 GSE95797_Hs1.polished.split.mismatch_wide.bed
-rw-r----- 1 user group   13K Mar 19 09:14 GSE95797_Hs1.polished.split.suspicious_2D.txt
-rw-r----- 1 user group   13K Mar 19 07:44 GSE95797_Hs1.polished.suspect_2D.txt
-rw-r----- 1 user group  439K Mar 20 09:29 GSE95797_Hs1.rawchrom.asm
-rw-r----- 1 user group   11M Mar 19 10:38 GSE95797_Hs1.rawchrom_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 19 10:38 GSE95797_Hs1.rawchrom_asm.superscaf_track.txt
-rw-r----- 1 user group  2.9M Mar 20 09:30 GSE95797_Hs1.rawchrom.assembly
-rw-r----- 1 user group  2.4M Mar 20 09:29 GSE95797_Hs1.rawchrom.cprops
-rw-r----- 1 user group  2.7G Mar 19 13:41 GSE95797_Hs1.rawchrom.fasta
-rw-r----- 1 user group  781M Mar 20 10:11 GSE95797_Hs1.rawchrom.hic
lrwxrwxrwx 1 user group    18 Mar 19 07:41 GSE95797_Hs1.resolved.asm -> GSE95797_Hs1.2.asm
lrwxrwxrwx 1 user group    37 Mar 19 07:41 GSE95797_Hs1.resolved_asm.scaffold_track.txt -> GSE95797_Hs1.2_asm.scaffold_track.txt
lrwxrwxrwx 1 user group    38 Mar 19 07:41 GSE95797_Hs1.resolved_asm.superscaf_track.txt -> GSE95797_Hs1.2_asm.superscaf_track.txt
lrwxrwxrwx 1 user group    21 Mar 19 07:41 GSE95797_Hs1.resolved.cprops -> GSE95797_Hs1.2.cprops
lrwxrwxrwx 1 user group    18 Mar 19 07:41 GSE95797_Hs1.resolved.hic -> GSE95797_Hs1.2.hic
-rw-r----- 1 user group  2.9M Mar 19 07:48 GSE95797_Hs1.resolved.polish.assembly
-rw-r----- 1 user group  443K Mar 19 09:14 GSE95797_Hs1.split.asm
-rw-r----- 1 user group   11M Mar 19 09:15 GSE95797_Hs1.split_asm.scaffold_track.txt
-rw-r----- 1 user group  5.1M Mar 19 09:15 GSE95797_Hs1.split_asm.superscaf_track.txt
-rw-r----- 1 user group  2.9M Mar 19 10:37 GSE95797_Hs1.split.assembly
-rw-r----- 1 user group  2.4M Mar 19 09:14 GSE95797_Hs1.split.cprops
-rw-r----- 1 user group 1009M Mar 19 10:37 GSE95797_Hs1.split.hic
-rw-r----- 1 user group  1.7K Mar 19 09:14 h.edits.txt
-rw-r----- 1 user group   25K Mar 19 01:27 mismatches.at.step.0.txt
-rw-r----- 1 user group   25K Mar 19 04:36 mismatches.at.step.1.txt
-rw-r----- 1 user group  372K Mar 19 01:27 mismatch_narrow.at.step.0.bed
-rw-r----- 1 user group  376K Mar 19 04:36 mismatch_narrow.at.step.1.bed
-rw-r----- 1 user group  6.6K Mar 19 01:25 mismatch_wide.at.step.0.bed
-rw-r----- 1 user group  5.1K Mar 19 04:34 mismatch_wide.at.step.1.bed
-rw-r----- 1 user group  3.3K Mar 19 01:27 repeats_wide.at.step.0.bed
-rw-r----- 1 user group  3.2K Mar 19 04:36 repeats_wide.at.step.1.bed
-rw-r----- 1 user group   90K Mar 19 01:27 suspect_2D.at.step.0.txt
-rw-r----- 1 user group   52K Mar 19 04:36 suspect_2D.at.step.1.txt
-rw-r----- 1 user group  372K Mar 19 01:27 suspect.at.step.0.bed
-rw-r----- 1 user group  374K Mar 19 04:36 suspect.at.step.1.bed
-rw-r----- 1 user group  5.7G Mar 20 09:30 temp.GSE95797_Hs1.final.asm_mnd.txt

dudcha commented 1 year ago

Hey,

I looked through the log and it looks fine to me. Something went wrong with the chromosome boundary detection, but you should have no trouble checking the result in JBAT. If you are concerned about the warnings, they are indeed just warnings. I've removed them in the latest release since they understandably unnerved people. Thanks,

Olga

acboulet commented 1 year ago

Thank you for the quick response. The 'GSE95797_Hs1.final.hic -> GSE95797_Hs1.rawchrom.hic' contact map and the 'GSE95797_Hs1.final.assembly -> GSE95797_Hs1.rawchrom.assembly' assembly file look promising when reviewed in JBAT.

How do I extract the FASTA corresponding to these scaffolds? 'GSE95797_Hs1.rawchrom.fasta' seems to correspond to the contigs.
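As the listing above shows, several of the `final` outputs are symlinks to `rawchrom` files. A small sketch for printing each symlink with its target, which makes it easier to tell which file is actually being loaded; `show_links` is a made-up name, and the sketch assumes a `readlink` command is available.

```shell
# Hypothetical helper: print "name -> target" for every symlink among the given paths.
# Usage: show_links FILE...
show_links() {
  for path in "$@"; do
    # -L is true only for symbolic links; regular files are skipped
    if [ -L "$path" ]; then
      printf '%s -> %s\n' "$path" "$(readlink "$path")"
    fi
  done
}

# Example, run inside the output directory:
#   show_links GSE95797_Hs1.final.*
```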

dudcha commented 1 year ago

Please check the genome assembly cookbook for general guidance: https://www.dnazoo.org/methods


Yang-xia13 commented 1 month ago


Hi Olga, I have run into the same problem. When I run 'run-asm-pipeline.sh -r 0 xx.fa merged_nodups.txt', I do not get the FINAL files (FINAL.fasta, FINAL.hic). There are no errors in the log file, only these warning messages:

:| WARNING: GNU Parallel version 20150322 or later not installed. We highly recommend to install it to increase performance. Starting pipeline without parallelization!

:| Warning: no explicit bundle size was listed. Will use the same one as listed for false positive size threshold: this is the most typical scenario.

:| WARNING: GNU Parallel version 20150322 or later not installed. We highly recommend to install it to increase performance. Starting pipeline without parallelization!

:| Warning: No input for label1 was provided. Default for label1 is ":::fragment_"

:| Warning: No input for label2 was provided. Default for label2 is ":::debris"

:| Warning: No input for label1 was provided. Default for label1 is ":::fragment_"

:| Warning: No input for label2 was provided. Default for label2 is ":::debris"

Can you help me with this? Thank you very much. Kind regards, Yang