andreaminio / AnnotationPipeline-EVM_based-DClab

Cantù Lab @ UC Davis - Annotation pipeline - EVM based

Question about merging Stringtie, Trinity(on genome), Trinity(de novo) results #6

Closed yaoxkkkkk closed 6 months ago

yaoxkkkkk commented 6 months ago

Thank you for developing this annotation pipeline. I am working through section 0.2.2 - RNAseq assembly. I have finished all three methods and now have several gff3 files. How should I merge the three sets of results before moving on to the final step, 0.2.2.4 - Export useful files?

yaoxkkkkk commented 6 months ago

One more question:

cat ${name}.trinity.og.on.genome.cov_iden_g95.fasta.transdecoder.gff3 | awk '$7!="-"' > ${name}.trinity.og.on.genome.cov_iden_g95.fasta.transdecoder.no_minus.gff3

Why do the genes located on the minus strand need to be removed?

andreaminio commented 6 months ago

Hi Yao,

  1. You can move on to 0.2.2.4 once you have finished all the necessary/possible assembly procedures. The code in that section should work no matter how many assemblies you ran, so you should be fine from there.
  2. In that phase, TransDecoder is run on transcript sequences, not on genes. Since the "negative" strand of an mRNA transcript is not translated, we remove those calls as potential noise.
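
For readers who prefer Python to awk, the strand filter quoted in the question can be sketched as follows. This is only an illustration, not part of the pipeline: the function name `drop_minus_strand` and the example records are made up, and it splits on tabs (as GFF3 specifies) whereas awk splits on any whitespace by default, so the two could differ if an earlier column contained spaces.

```python
def drop_minus_strand(gff3_lines):
    """Keep every line whose 7th tab-separated column (strand) is not '-'.

    Hypothetical helper mirroring awk '$7!="-"'; not taken from the pipeline.
    """
    kept = []
    for line in gff3_lines:
        fields = line.rstrip("\n").split("\t")
        # Comment/directive lines and blank lines have no strand column.
        # They are kept, as awk keeps them when $7 is empty.
        if len(fields) >= 7 and fields[6] == "-":
            continue
        kept.append(line)
    return kept


# Made-up TransDecoder-style records on transcript coordinates.
example = [
    "##gff3-version 3\n",
    "transcript_1\ttransdecoder\tCDS\t10\t300\t.\t+\t0\tID=cds.1\n",
    "transcript_2\ttransdecoder\tCDS\t5\t200\t.\t-\t0\tID=cds.2\n",
]
filtered = drop_minus_strand(example)
```

Here `filtered` keeps the header line and the `+` record and drops the ORF called on the minus strand of `transcript_2`.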

yaoxkkkkk commented 6 months ago

Glad to receive your quick reply! I am trying to follow this section through to the final part, but I ran into a minor bug with GFF_extract_features.py. I created a new environment whose Python version is Python 2.7.18 :: Anaconda, Inc. Here is the error message:

$ python GFF_extract_features.py 
Traceback (most recent call last):
  File "GFF_extract_features.py", line 7, in <module>
    import pandas as pd
  File "/dssg/home/acct-jiang.lu/jiang.lu/anaconda3/envs/py2/lib/python2.7/site-packages/pandas/__init__.py", line 13, in <module>
    __import__(dependency)
  File "/dssg/home/acct-jiang.lu/jiang.lu/anaconda3/envs/py2/lib/python2.7/site-packages/dateutil/__init__.py", line 5, in <module>
    from ._version import version as __version__
  File "/dssg/home/acct-jiang.lu/jiang.lu/anaconda3/envs/py2/lib/python2.7/site-packages/dateutil/_version.py", line 10
    version: str
           ^
SyntaxError: invalid syntax

yaoxkkkkk commented 6 months ago

Oh, I figured it out: the problem was how the package was installed. I had installed pandas with pip, which produced the error above. After installing pandas with Anaconda instead, the problem was solved!
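
The traceback above stops at `version: str`, a variable annotation. That is Python 3 syntax, so the dateutil release that pip pulled in cannot be parsed by Python 2.7. A quick way to confirm that an environment is consistent before running GFF_extract_features.py is to try the imports directly. The sketch below is only a diagnostic aid; the helper name `check_import` is made up and not part of the pipeline.

```python
from __future__ import print_function

import importlib
import sys


def check_import(module_name):
    """Try to import a module and return (ok, message). Hypothetical helper."""
    py = "{0}.{1}".format(sys.version_info[0], sys.version_info[1])
    try:
        module = importlib.import_module(module_name)
    except SyntaxError as err:
        # Typical symptom of a package release written for a newer Python.
        return False, "{0}: cannot be parsed by Python {1} ({2})".format(
            module_name, py, err)
    except ImportError as err:
        return False, "{0}: not installed ({1})".format(module_name, err)
    except Exception as err:
        return False, "{0}: failed to import ({1})".format(module_name, err)
    version = getattr(module, "__version__", "unknown")
    return True, "{0} {1}".format(module_name, version)


# Check the two packages that appear in the traceback.
for name in ("dateutil", "pandas"):
    ok, message = check_import(name)
    print("OK  " if ok else "FAIL", message)
```

If either line reports FAIL, reinstalling that package with the same tool that created the environment (conda, in this thread) is the fix that worked here.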