ogotoh / spaln

Genome mapping and spliced alignment of cDNA or amino acid sequences
GNU General Public License v2.0
Almost all spaln mRNA without the final stop codon #48

Open xiekunwhy opened 2 years ago

xiekunwhy commented 2 years ago

Hi,

I found that almost all spaln mRNAs lack the final stop codon. Is there a way to include it?

Here are my command lines:

makeidx.pl -inp Op-f.gf
spaln -t20 -M4 -Q7 -O0 -LS -ya012 -o all_aa.gff3 -d Op-f all.aa

When I used gffread to extract the CDS sequences from all_aa.gff3, I found that only a few of them end with a stop codon: only 10 out of 50000.
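A check like the one described above can be sketched in Python. This is only an illustration, not part of spaln or gffread: the function names `read_fasta` and `count_with_stop` are hypothetical, and the stop codon set assumes the standard genetic code.

```python
from typing import Dict, Iterable, Tuple

# Assumption: standard genetic code. Other translation tables use different stops.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into a dict mapping record ID to sequence."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def count_with_stop(records: Dict[str, str]) -> Tuple[int, int]:
    """Return (number of sequences ending in a stop codon, total sequences)."""
    with_stop = sum(
        1 for seq in records.values() if seq[-3:].upper() in STOP_CODONS
    )
    return with_stop, len(records)
```

To use it, first write the CDS sequences to a FASTA file, for example with gffread's `-x` option together with `-g` for the genome FASTA, then pass the open file to `read_fasta` and the result to `count_with_stop`. The file names for that step are up to you.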

Best, Kun

ogotoh commented 2 years ago

Dear Kun,

Thank you for your comment. I am preparing a revised version in which the termination codon will be included in the output of all formats for a protein query. Please wait a few days until the next release.

Osamu,

xiekunwhy commented 2 years ago

Wow, I am waiting online.

xiekunwhy commented 2 years ago

Hi Osamu,

Please also update the bioconda package when the next release is out.

Best wishes, Kun

ogotoh commented 2 years ago

Dear Kun,

I have just uploaded the new version, ver.2.4.8. Please confirm that the results are what you want.

I actually don't know how to update bioconda. I will ask someone how to do it.

xiekunwhy commented 2 years ago

Dear Osamu,

The results are exactly what I want, thank you very much.

I still have a few other questions, if I may trouble you.

1) gffread reported that some coordinates are wrong (end<start!). There are not many, and I can ignore these mRNAs, but I thought you should know. The gffread log is attached: all_aa.spaln1.pep.log.txt

2) Annotating genomes will be my daily work for the next few months. Could you share the scripts and command lines you used to generate the species-specific parameters? Some of the species I am annotating are molluscs. I saw that you explained how to do this to someone else (https://github.com/ogotoh/spaln/issues/39), but some parts are missing. I will do my best to understand all the scripts and command lines if you can share them with me.
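For point 1), the affected records can be listed before they are ignored by scanning the GFF3 for features whose end coordinate is smaller than the start. The following is a minimal sketch that assumes standard nine-column, tab-separated GFF3. The function name `find_inverted_features` is hypothetical and is not part of spaln or gffread.

```python
from typing import Iterable, List, Tuple


def find_inverted_features(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, attributes) for GFF3 features with end < start."""
    bad: List[Tuple[int, str]] = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        try:
            # GFF3 columns 4 and 5 are the 1-based start and end.
            start, end = int(cols[3]), int(cols[4])
        except ValueError:
            continue
        if end < start:
            bad.append((lineno, cols[8]))
    return bad
```

Calling it on the open all_aa.gff3 file returns the line numbers and attribute strings of the offending features, which makes it easy to see which mRNAs they belong to.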

Best regards, Kun

ogotoh commented 2 years ago

Dear Kun,

Thank you for the examples of wrong coordinates. I will scrutinize the cases to improve spaln’s performance.

I am currently developing a scheme that obtains species-specific parameter sets more easily than the present scripts do. If the genomic assembly is of high quality and a sufficiently large number of transcript sequences are available, this is not a hard task. I am trying to obtain a reliable parameter set even when either of these conditions is not satisfied. Although the final goal will not be reached soon, I will make the scripts under development and the additional C++ code publicly available in a couple of weeks, after checking their integrity.

Osamu,