Closed blackFirefly closed 7 years ago
I can take a look if you can send me "lifted.gff3".
That would be great! Since the file is around 30 MB, I sent a Dropbox link to the email address stated in your profile.
I am seeing the same problem as well... need any more test data?
I think the specific issue with these parents not being defined happens due to them being in the unlifted file
For example I had child features with Parent=SP_0.1_T008586-R3 in lifted.gff3 but then unlifted.gff3 had the actual parent where ID=PKINGS_0.1_T008586-R3
That is expected, as liftOver reads the GFF line by line rather than each transcript as a whole. flo's process_gff method tries to fix such inconsistencies in liftOver's output, so the final output from flo should not be an invalid GFF.
Ah... I think I remember at one point writing a script to synthesize parent features for features without parents, for something like this... is that what process_gff does?
That, and eliminating transcripts that mapped partly to different scaffolds.
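As a rough illustration of that second step, here is a minimal sketch of a filter that drops transcripts whose child features were lifted to more than one scaffold. It is not flo's actual process_gff code, which is not shown in this thread; the method names `parse_attributes` and `drop_split_transcripts` are made up for the example, and it assumes tab-separated GFF3 with a single-valued Parent attribute.

```ruby
# Hypothetical sketch, not flo's actual process_gff implementation.

# Turn a GFF3 attribute column ("ID=a;Parent=b") into a Hash.
def parse_attributes(column9)
  column9.to_s.strip.split(';')
         .map { |pair| pair.split('=', 2) }
         .select { |pair| pair.length == 2 }.to_h
end

# Drop every feature whose children ended up on more than one scaffold
# (seqid), together with those children. Comment and blank lines are
# skipped and not re-emitted.
def drop_split_transcripts(lines)
  records = lines.reject { |l| l.start_with?('#') || l.strip.empty? }
                 .map { |l| l.chomp.split("\t") }

  # Collect the scaffolds on which each parent's children were found.
  scaffolds = Hash.new { |hash, key| hash[key] = [] }
  records.each do |cols|
    parent = parse_attributes(cols[8])['Parent']
    scaffolds[parent] |= [cols[0]] if parent
  end
  split = scaffolds.select { |_, seqids| seqids.length > 1 }.keys

  kept = records.reject do |cols|
    attrs = parse_attributes(cols[8])
    split.include?(attrs['ID']) || split.include?(attrs['Parent'])
  end
  kept.map { |cols| cols.join("\t") }
end
```

Something like `puts drop_split_transcripts(File.readlines('lifted.gff3'))` would print the surviving features. A real implementation would presumably also have to handle deeper nesting and multi-valued Parent attributes.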
Gotcha... I was considering using CrossMap instead, but it looks like it has the same issue.
Maybe I need to convert from GFF to something else, BED12 or similar.
@cmdcolin: "I am seeing the same problem as well... need any more test data?"
There was a bug. I have made some changes. Can you give it a spin?
@blackFirefly - please see my email
@yeban I believe it is working better now. It gets to the genometools stage, but genometools ends up crashing.
Could maybe ask their team about it; the error message isn't easy to interpret.
$ rake
mkdir annotations.gff-liftover-target
liftOver -gff annotations.gff run/liftover.chn annotations.gff-liftover-target/lifted.gff3 annotations.gff-liftover-target/unlifted.gff3
Reading liftover chains
Mapping coordinates
WARNING: -gff is not recommended.
Use 'ldHgGene -out=<file.gp>' and then 'liftOver -genePred <file.gp>'
/home/me/flo/gff_recover.rb annotations.gff-liftover-target/lifted.gff3 | gt gff3 -tidy -sort -addids -retainids - > annotations.gff-liftover-target/annotations.gff-liftover-target.gff3
warning: line 1 in file "-" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
Assertion failed: (elemidx >= q->front), function gt_queue_remove, file src/core/queue.c, line 135.
This is a bug, please report it at
https://github.com/genometools/genometools/issues
Please make sure you are running the latest release which can be found at
http://genometools.org/pub/
You can check your version number with `gt -version`.
Aborted (core dumped)
/home/me/flo/gff_recover.rb:60:in `write': Broken pipe @ io_write - <STDOUT> (Errno::EPIPE)
from /home/me/flo/gff_recover.rb:60:in `puts'
from /home/me/flo/gff_recover.rb:60:in `puts'
from /home/me/flo/gff_recover.rb:60:in `<main>'
rake aborted!
At least one thing that could be suspicious is that there are still lines that exist without parents. If I save the file from
gff_recover.rb annotations.gff-liftover-target/lifted.gff3 > out.gff
then out.gff (first feature in file) has an mRNA that references a parent gene that is not in out.gff
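A quick way to confirm this kind of dangling reference is to list every feature whose Parent is never defined as an ID in the same file. The following is only a sketch: `orphaned_features` is a made-up helper, not part of flo or genometools, and it assumes tab-separated GFF3 with attributes in column 9.

```ruby
# Hypothetical helper: report features whose Parent is not defined as an
# ID anywhere in the same GFF3 file.
# Returns an array of [line_number, missing_parent_id] pairs.
def orphaned_features(lines)
  ids = {}
  children = []
  lines.each_with_index do |line, index|
    next if line.start_with?('#') || line.strip.empty?

    attrs = line.chomp.split("\t")[8].to_s.split(';')
                .map { |pair| pair.split('=', 2) }
                .select { |pair| pair.length == 2 }.to_h
    ids[attrs['ID']] = true if attrs['ID']
    # Parent may hold several comma-separated IDs.
    attrs['Parent'].to_s.split(',').each do |parent|
      children << [index + 1, parent]
    end
  end
  # IDs are collected over the whole file first, so a parent that
  # appears after its child still counts as defined.
  children.reject { |_, parent| ids.key?(parent) }
end
```

Running `orphaned_features(File.readlines('out.gff'))` should then return the line numbers and the missing parent IDs, which may help narrow down what genometools is tripping over.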
@blackFirefly's problem was partly flo and partly the gff. The former is now fixed.
@cmdcolin I can't be sure what the problem is without looking at the input / lifted gff. Please could you open a new issue with test data?
I tried flo yesterday, but it ended in an error. It seems like there is a problem in a temporary GFF file? So the question is whether the program or my input GFF is the problem.
It created a file called "lifted.gff3" and one called "unlifted.gff3". Both of them are filled. But there is also a third file "Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta.gff3" which is empty.
Here are the last lines flo printed: