ogotoh / spaln

Genome mapping and spliced alignment of cDNA or amino acid sequences
GNU General Public License v2.0

Transcript and genome to gff gives empty files #28

Open kristianHoden opened 4 years ago

kristianHoden commented 4 years ago

Dear Gotoh, I'm struggling to get a GFF file from my transcript and genome files. I've run:

    makeidx.pl -ip path/to/genome.mfa
    spaln -Q1 -O0 -t32 path/to/genome.mfa path/to/transcripts.mfa > gff-file

I've also tried:

    spaln -Q1 -O0 path/to/genome.mfa path/to/transcripts.mfa > gff-file

I get no gff-file from this. The only error message I get is "killed".

I've also tried:

    spaln -Q1 -O0 path/to/genome.mfa path/to/transcripts.mfa > gff-file

giving: Segmentation fault (core dumped)

and:

    spaln -Q1 -O0 path/to/genome.mfa path/to/transcripts.mfa -o gff-file

which actually writes the GFF output to the terminal, but nothing is captured in the gff-file.

    spaln -Q1 -O0,2 path/to/genome.mfa path/to/transcripts.mfa -o gff-file

gives two empty output files before it's killed.

I honestly don't understand the difference in -QN when N is 1, 2, or 3. Maybe this is what causes the problem.

Could you see what is the cause of the issue?

Thanks in advance, Kristian

ogotoh commented 4 years ago

Spaln has two major groups of running modes: -Q0 to -Q3 (spliced alignment between a genomic segment and transcripts) and -Q4 to -Q7 (genome mapping and alignment). In your case, you probably need the latter. The simplest way to do so is to store both the genomic and transcript sequences in the directory ~/seqdb/ and then run:

1) $ cd ~/seqdb
2) $ makeidx.pl -ip genome.mfa (or $ spaln -W -KP genome.mfa)
3) $ spaln -Q7 (or 4, 5, or 6) -d genome -O0 -T xxx -o gff-file transcripts.mfa (or > gff-file instead of the -o option)

where xxx denotes the identifier of the species-specific parameter set (e.g. Tetrapod, Eudicoty, etc., listed in the ~/table/gnm2tab file).
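
The steps above can be collected into a small script. This is only a sketch built from the commands quoted in this thread: the file names (genome.mfa, transcripts.mfa, gff-file) are the ones used above, `xxx` is still a placeholder that must be replaced by a real parameter-set identifier, and the `run` helper and the `RUN` and `SEQDB` variables are hypothetical conveniences, not part of spaln. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch of the genome-mapping workflow (-Q4 to -Q7 modes) from this thread.
# Assumptions: sequences live in $SEQDB; SPECIES is a placeholder ("xxx").
SEQDB="${SEQDB:-$HOME/seqdb}"
GENOME="genome"            # basename of genome.mfa, passed to -d
QUERY="transcripts.mfa"
SPECIES="xxx"              # replace with a species-specific parameter set
OUT="gff-file"

# Hypothetical helper: execute the command if RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# 1) work inside the sequence directory
run cd "$SEQDB" || exit 1
# 2) format (index) the genomic sequence
run makeidx.pl -ip "$GENOME.mfa"
# 3) map and align the transcripts, writing GFF output (-O0) to $OUT
run spaln -Q7 -d "$GENOME" -O0 -T "$SPECIES" -o "$OUT" "$QUERY"
```

Note that in step 3 all options come before the query file, as in the command given above.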

If you installed spaln with the default settings, you should still store the genomic sequence in the ~/seqdb directory and format it there. Otherwise, you must specify the location of your formatted genomic sequence with the environment variable 'ALN_DBS'. Then you can run spaln from any other directory.
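
As a minimal sketch of the environment-variable route, assuming a POSIX shell: the fallback directory below is only an example and should point to wherever the genome was actually formatted. The spaln call is left as a comment because `xxx` and the transcript path are placeholders.

```shell
#!/bin/sh
# Point spaln at the directory holding the formatted genomic sequence.
# The fallback $HOME/seqdb is an assumed example location; adjust as needed.
ALN_DBS="${ALN_DBS:-$HOME/seqdb}"
export ALN_DBS

# Sanity check before starting a long run.
if [ -d "$ALN_DBS" ]; then
    echo "ALN_DBS points at $ALN_DBS"
else
    echo "warning: $ALN_DBS does not exist" >&2
fi

# With ALN_DBS exported, spaln can be started from any working directory:
# spaln -Q7 -d genome -O0 -T xxx -o gff-file /path/to/transcripts.mfa
```

Adding the `export` line to a shell startup file would make the setting persistent across sessions.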

Osamu