fmfi-genomika / genomikaMalSym


(E) Augustus (slow, needs A) #3

Closed JakubNvk closed 6 years ago

JakubNvk commented 6 years ago
rasto2211 commented 6 years ago

I used the ustilago_maydis model (a fungus), since Augustus has no MalSym model. From the Augustus docs:

Note that for closely related species usually only one version is necessary. For example, the human version is good for all mammals.
rasto2211 commented 6 years ago

Output from gtfToGenePred:

augustus.gtf doesn't appear to be a GTF file (GFF not supported by this program)
rasto2211 commented 6 years ago

It seems that we need to remove the lines that have transcript or gene in the third column, along with the comment lines:

grep -v "^#" augustus.gtf | grep -Pv "\t(gene|transcript)\t" > augustus2.gtf

The transcript and gene lines hold data in the 9th column in a format that does not conform to the GTF spec.
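A minimal sketch of the same filter, run on a made-up sample so the effect is easy to see. The file names sample.gtf and sample.filtered.gtf and the records in them are hypothetical; the sample assumes that Augustus writes a bare ID in column 9 of gene and transcript lines, which is what gtfToGenePred appears to reject. Unlike the grep above, this awk variant tests only the third column, so a tab-delimited "gene" elsewhere in a line would not cause a false match.

```shell
# Hypothetical sample in the style of Augustus output (tab-separated).
printf '# sample header\n' > sample.gtf
printf 'chr1\tAUGUSTUS\tgene\t1\t900\t0.9\t+\t.\tg1\n' >> sample.gtf
printf 'chr1\tAUGUSTUS\ttranscript\t1\t900\t0.9\t+\t.\tg1.t1\n' >> sample.gtf
printf 'chr1\tAUGUSTUS\tCDS\t1\t900\t0.9\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";\n' >> sample.gtf

# Drop comment lines and any record whose third column is gene or transcript.
awk -F'\t' '!/^#/ && $3 != "gene" && $3 != "transcript"' sample.gtf > sample.filtered.gtf

# Count the feature types that survive the filter.
cut -f3 sample.filtered.gtf | sort | uniq -c
```

Running the last line on the real augustus2.gtf gives a quick check that no gene or transcript records remain.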

rasto2211 commented 6 years ago

Convert GTF to GenePredExt:

gtfToGenePred -genePredExt augustus2.gtf augustus.genePredExt

Load the data from the GenePredExt file into the augustus SQL table:

hgLoadGenePred -genePredExt malSym1 augustus augustus.genePredExt
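Before loading, it may be worth checking that the converted file has the expected shape. The sketch below assumes the extended genePred layout of 15 tab-separated fields per record; sample.genePredExt and its single record are hypothetical stand-ins for augustus.genePredExt.

```shell
# Hypothetical one-transcript record with 15 tab-separated genePredExt fields.
printf 'g1.t1\tchr1\t+\t0\t900\t0\t900\t1\t0,\t900,\t0\tg1\tcmpl\tcmpl\t0,\n' > sample.genePredExt

# Count the records whose field count is not 15.
bad=$(awk -F'\t' 'NF != 15 { n++ } END { print n + 0 }' sample.genePredExt)
echo "records with unexpected field count: $bad"
```

A non-zero count would suggest that the conversion ran without -genePredExt or that the file was truncated.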
rasto2211 commented 6 years ago

There is no need to modify /kentsrc/trackDb/malSym/malSym1/trackDb.ra, since the augustus track is already defined in /kentsrc/trackDb/trackDb.ra.

izri16 commented 6 years ago

@rasto2211 I think it would be better to document these steps in the wiki rather than keep them here.

rasto2211 commented 6 years ago

I looked at the augustus track in the genome browser and it seems quite reasonable.

I also added the docs to the GitHub wiki.