Closed olechnwin closed 1 year ago
Hi Cen,
Use the default "-s ensembl_merge". The custom field re-arrangement can only be used if all lines have the same number of ID fields which is not the case in your file.
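To see whether a BED file meets that requirement, a small check along these lines may help. It is only a sketch, not part of TAMA: the function name `id_field_counts` is hypothetical, and it assumes the ID subfields in column 4 are separated by semicolons.

```python
from collections import Counter


def id_field_counts(bed_lines, sep=";"):
    """Tally how many BED lines have each number of ID subfields in column 4.

    Assumption: the ID subfields in column 4 are separated by `sep`.
    """
    counts = Counter()
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 4:
            continue  # not a usable BED record
        counts[len(cols[3].split(sep))] += 1
    return counts


# Made-up example records, only to illustrate the check.
example = [
    "chr1\t0\t10\tG1;G1.1\t40\t+",
    "chr1\t20\t30\tG2;G2.1;extra\t40\t+",
]
for n_fields, n_lines in sorted(id_field_counts(example).items()):
    print(f"{n_lines} line(s) with {n_fields} ID subfield(s)")
```

If more than one subfield count is reported, the lines do not all have the same number of ID fields. To run it on a real file, pass an open file handle instead of the example list.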
Thank you, Richard
Hi Richard,
Thank you so much for your help. As always truly appreciate you spending the time to reply.
Best, Cen
Hi Richard,
I'm very sorry. I meant to post it here. My bad. I'm going to delete that other post. To keep it in the same thread, here is the image where the novel transcript disappears after filtering. As shown below, the first two tracks are the merged annotations, and the third track, which is after filtering, is missing G50504. Is there a way to keep G50504?
I was running the same exact command with "-s ensembl_merge"
python ~/opt/tama/tama_go/format_converter/tama_format_gff_to_bed12_liftoff.py ${gff_dir}/${gff_name} ${gff_name/.gff3/.bed}
python ~/opt/tama/tama_merge.py -f filelist.txt -d merge_dup -p merged_annos_a673_2 -s gencode
python ~/opt/tama/tama_go/format_converter/tama_format_id_filter.py -b merged_annos_a673_2.bed \
-o merged_annos_a673_2_filt.bed
python ~/opt/tama/tama_go/format_converter/tama_convert_bed_gtf_ensembl_no_cds.py \
merged_annos_a673_2_filt.bed merged_annos_a673_2_filt.gtf
Hi Cen,
Could you load all the bed and GTF files produced along the way to see where the novel models drop out?
Also, the next time you show the genome browser view, can you make sure all tracks are in expanded mode?
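As a complement to the browser view, a quick scan of each intermediate file can show the step at which an ID stops appearing. The sketch below is not a TAMA tool: the function name `count_id_lines` is hypothetical, and the file names and the ID G50504 are taken from this thread.

```python
import re
from pathlib import Path


def count_id_lines(identifier, paths):
    """Return {path: number of lines mentioning identifier}.

    A file that does not exist maps to None. The ID must not be flanked by
    letters, digits or underscores, so G50504 matches G50504.1 but not G505041.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(identifier) + r"(?!\w)")
    hits = {}
    for path in paths:
        p = Path(path)
        if not p.exists():
            hits[str(path)] = None
            continue
        with p.open() as fh:
            hits[str(path)] = sum(1 for line in fh if pattern.search(line))
    return hits


# File names taken from the commands in this thread, in processing order.
steps = [
    "merged_annos_a673_2.bed",
    "merged_annos_a673_2_filt.bed",
    "merged_annos_a673_2_filt.gtf",
]
for name, n in count_id_lines("G50504", steps).items():
    print(name, "not found" if n is None else f"{n} matching line(s)")
```

The first file in the list that reports zero matching lines marks the step where the model was dropped.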
Thank you, Richard
Hi Richard,
Here are all the bed and GTF files. It turns out the novel transcript was dropped during the conversion to GTF. Also, do you happen to know why the transcripts are in different colors?
Thank you so much! Cen
edit: adding filelist and step of processing.
Hi Cen,
Sorry but could you annotate the image to indicate which track is showing which step of processing?
Thank you, Richard
Hi Richard, I have updated the figure above with the processing steps. Thank you, Cen
Hi Cen,
Ok I see the problem now. I have fixed the bug here for tama_format_id_filter.py. Could you update and try the new version?
The problem is that TAMA was not adding the TAMA IDs to the first 2 ID subfields, so they were not being recognized by the GTF converter.
As for the different coloured transcript models, that is a feature of TAMA Merge using the bed file to be able to show the source of origin by the colour. You can read about this in the wiki TAMA Merge page.
Thank you, Richard
Hi Richard,
Thank you so much for quickly fixing the problem. I will try the new version. It'll take me a while to try it though, as our HPC cluster has been swamped lately.
Thank you, Cen
Hi Richard,
I have tried the new version. It works! The missing novel transcript is now in the GTF file.
Thank you so much! Cen
Hi Cen,
Glad it's working for you now!
Thanks for using TAMA! Richard
Hi Richard, I was trying to re-arrange the ID line of my bed file, which was generated by merging Liftoff and Iso-Seq annotations:
This is the error I got:
Here are some example lines from merged_annos_a673_2:
How do I fix this error?
Thank you for your help! Cen