Oshlack / necklace

Combine reference and assembled transcriptomes for RNA-Seq analysis
https://github.com/Oshlack/necklace/wiki
GNU General Public License v3.0

stuck at merge_genome_annotation step #15

Closed stephanyfoster closed 3 years ago

stephanyfoster commented 4 years ago

Hello-

I previously concatenated five transcriptomes to run necklace (after issue #13), and I get stuck at the merge_genome_annotation step. I allocate 96 hours on a computing node to run necklace, and that is not enough time to get past this step, so I have some questions.

My five transcriptomes are 32.9-35.8 MB. How long would you expect a necklace run to take in this case? Or for maybe two transcriptomes?

nadiadavidson commented 4 years ago

This step usually just takes seconds to minutes. It may be that the program is still performing de novo assembly and the merge_genome_annotation step has finished (they are usually done in parallel). If you provided the de novo assemblies, something else must be wrong. I'm happy to investigate if you send more information, like the output of the program (bpipe log) and also what files it has currently created (ls -lh *).

stephanyfoster commented 4 years ago

I attached a screenshot of what I see for the bpipe log. As for the files currently created, I see three directories: de_novo_assembly (165M), genome_guided_assembly (5.6G), and genome_superTranscriptome (0). Does this give more information, or is there something else I can do? At this point, I have had necklace running with 2 transcriptomes (not 5) for 24 hours.

Screen Shot 2020-03-06 at 3 51 41 PM

nadiadavidson commented 4 years ago

Hi, can you please send me the results of an ls -lh in each of the three subdirectories so I can see which files have already been made and which are missing. Hopefully it's just that the de novo assembly hasn't finished and things will complete with time, although it's a bit worrying that the genome_superTranscriptome directory appears to be empty. Can you also send me your configuration/input file, as well as the contents of the tools.groovy file in the directory where necklace is installed.
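
For anyone gathering the same information, the listing requested above could be collected in one pass with something like the sketch below. The three directory names are the ones mentioned in this thread; the function name and the idea of passing the run directory as an argument are assumptions made for illustration.

```shell
# List the contents of each necklace output subdirectory named in this thread.
# list_necklace_outputs is a hypothetical helper, not part of necklace.
list_necklace_outputs() {
    base="${1:-.}"   # assumed: the directory necklace was run from
    for dir in de_novo_assembly genome_guided_assembly genome_superTranscriptome; do
        echo "== $dir =="
        if [ -d "$base/$dir" ]; then
            ls -lh "$base/$dir"
        else
            echo "(directory not found)"
        fi
    done
}
```

Running `list_necklace_outputs . > necklace_ls.txt` from the run directory would give a text file that can be pasted into the issue instead of screenshots.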

stephanyfoster commented 4 years ago

sure: Screen Shot 2020-03-06 at 10 04 09 PM Screen Shot 2020-03-06 at 10 02 27 PM Screen Shot 2020-03-06 at 10 01 39 PM

stephanyfoster commented 4 years ago

Screen Shot 2020-03-06 at 10 17 02 PM

nadiadavidson commented 4 years ago

Hi, thanks for this. I can see now that it does appear to be stuck in the merge_genome_annotation step. I can see a couple of issues with your configuration file that, once fixed, will hopefully sort this out.

  1. You need to provide necklace with a genome annotation for your species, i.e. a .gtf file.
  2. You have set the genome file twice. Set this to whichever is the genome of the species you sequenced (maybe Pmin in your case?). For the related species you should supply protein sequences in fasta format.

See https://github.com/Oshlack/necklace/blob/master/input_template.config or https://github.com/Oshlack/necklace/wiki/Getting-Started for an example configuration file. All variables in this file must be set. Be aware that we changed the configuration file template at version 1.11, so protein sequences are now needed for the related species instead of a reference genome and annotation.

Cheers, Nadia.

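
One way to spot a setting that has been assigned more than once is a quick duplicate-name check on the configuration file. This is only a sketch: it assumes the file holds one `name=value` assignment per line and that lines starting with `//` are comments, and the helper name is hypothetical rather than anything necklace provides.

```shell
# Print every variable name that is assigned more than once in a config file.
# Assumes one name=value assignment per line; lines starting with // are skipped.
find_duplicate_settings() {
    grep -v '^[[:space:]]*//' "$1" |
        grep '=' |
        sed 's/[[:space:]]*=.*//; s/^[[:space:]]*//' |
        sort | uniq -d
}
```

Calling `find_duplicate_settings my_input.config` (file name assumed) prints nothing when every name appears once. It does not check that every variable from the template is present, which still needs a comparison against input_template.config.
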
stephanyfoster commented 4 years ago

Can I do this at all without a gtf file?

nadiadavidson commented 4 years ago

You may be able to get away with a dummy .gtf file (i.e. empty), but we have not tried this.
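
For reference, an empty placeholder file of that kind could be created as sketched below. The default file name and the helper are hypothetical, and as the reply says, this approach is untested with necklace, so it may or may not get past the merge_genome_annotation step.

```shell
# Create (or truncate to) an empty placeholder annotation file.
# make_dummy_gtf and the default name dummy.gtf are illustrative only.
make_dummy_gtf() {
    out="${1:-dummy.gtf}"
    : > "$out"       # writes a zero-byte file
    ls -l "$out"
}
```

The path of the resulting file would then be given as the annotation in the configuration file.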