lazypanda10117 closed 6 years ago
Some of my thoughts off the bat.
[x] GTF should be reformatted to our Robust Column Selection format (see the second version in the comment). It's easy to read and will work well.
[x] Review the detailed specification for GTF files, link it, and then consider how the data in each column can best be highlighted. For example, the feature column is exon, transcript, ... or one of a narrow list of keywords; giving each of these a slightly different shade of one color would be really nice. We can discuss each data type here if you have any questions, to give you a sense of how a biologist would read it. Read through the sam.sublime-syntax file to see how a systematic syntax works. All the major defined tags have explicit use cases.
[x] In the attributes column, highlight the data based on its type (numeric / string).
[x] The 'Score' column is usually scaled 0-1000; use the 10 gradient coloring scopes to scale accordingly (see wig syntax)
[x] Find some RefSeq / GENCODE GTF files, and maybe some more examples from the internet, and see how people are using the format and how our syntax highlighting works in different use cases. We want it to be broadly applicable.
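As an illustration of the numeric / string split in the attributes column, here is a minimal Python sketch. It is not the syntax definition itself: the function name `classify_attributes` and both regular expressions are hypothetical, and it assumes the common `key "value"; key value;` attribute layout. Values are classified by their content, so a quoted value that contains only digits still counts as numeric.

```python
import re

# Assumed attribute layout: key "value"; key value; ...
# Group 1 is the key, group 3 the text inside quotes (if quoted),
# group 2 the raw value including any quotes.
ATTR_RE = re.compile(r'(\w+)\s+("([^"]*)"|[^\s;]+)\s*;')

# Integers and simple decimals, optionally negative.
NUMERIC_RE = re.compile(r'^-?\d+(\.\d+)?$')


def classify_attributes(attr_column):
    """Return (key, value, kind) triples; kind is 'numeric' or 'string'."""
    result = []
    for match in ATTR_RE.finditer(attr_column):
        key = match.group(1)
        quoted = match.group(3)
        # Prefer the unquoted inner text when the value was quoted.
        value = quoted if quoted is not None else match.group(2)
        kind = "numeric" if NUMERIC_RE.match(value) else "string"
        result.append((key, value, kind))
    return result


example = 'gene_id "ENSG00000223972.5"; level 2; gene_name "DDX11L1";'
for key, value, kind in classify_attributes(example):
    print(key, value, kind)
```

A highlighter would assign one scope to the numeric values and another to the strings; the same two-way test could be expressed as two alternative patterns in a syntax file.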
Thanks! I will work on that over the weekend.
I have just updated the GTF syntax. Is it better now?
Can you add screenshots please?
Nice; attributes are much clearer like this and the code is very legible, it's obvious what each component does. Care to fill out the appropriate colors for software (source) / chr.Start / chr.End and add gradient support to the scoring? (Divide 0-1000 into 10 even segments.)
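The "10 even segments" split can be sketched in Python as below. This only illustrates the arithmetic; the function name `score_bin` and its parameters are hypothetical, and the actual syntax files would express the same ranges as regex patterns mapped to the gradient scopes. It assumes an unscored field is written as `.` and that out-of-range scores are clamped.

```python
def score_bin(score_field, n_bins=10, max_score=1000):
    """Map a score field to a bin index 0..n_bins-1, or None if unscored."""
    if score_field == ".":
        # Assumption: '.' marks a missing score and gets no gradient color.
        return None
    score = float(score_field)
    # Clamp to the expected 0..max_score range.
    score = min(max(score, 0.0), float(max_score))
    width = max_score / n_bins
    # The top value (1000) falls into the last bin rather than an 11th one.
    return min(int(score // width), n_bins - 1)


for field in ["0", "99", "100", "550.5", "1000", "."]:
    print(field, score_bin(field))
```

With the defaults, scores 0-99 land in bin 0, 100-199 in bin 1, and so on, with 900-1000 in bin 9.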
What is an appropriate color for software(source)? I have fixed the other issues except for the software color.
Software in bioMonokai is orange italics; sorry all the different themes are not updated at the moment.
It's more about getting all the regex engines working correctly (like gradient scoring); we can worry about the particulars of each color and biological class last.
Sweet looking syntax; how do you feel about giving .bed and .wig an update too?
Sure, I will try to update them later tonight!
Hey @lazypanda10117 ,
I was testing out the GTF syntax here on some files from different sources. It works great for Cufflinks and UCSC GTF files, but there are some bugs (skipped syntax) in GENCODE-generated GTF files. That's a pretty widely used standard.
Pull the update and, when you get some time, can you take a look at the new examples/annot/*.gtf files I've uploaded and fix up the syntax?
@lazypanda10117 I'm going through the gedit syntaxes and testing them and formatting the headers. The GTF syntax works really well for all the different formats. Good job!
There's a small quirk/bug in that the tabs are being selected around the feature columns and not just the word, which means if there is a background color then the selection is too wide (see below).
Can you re-format the regex to the 'robust column selection' format in gedit, along the lines of syntax/gedit/faidx.lang? It'll make debugging much simpler in the future and correct the flanking tab selection issue.
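To show the flanking-tab issue in isolation, here is a small Python sketch. It is not the project's 'robust column selection' format, whose exact form is defined in the repository; the pattern `FEATURE_RE` and the helper `feature_span` are hypothetical. The idea is to consume the preceding tab-terminated columns in a non-capturing group and capture only the word itself, so the highlighted span excludes the tabs on either side.

```python
import re

# Hypothetical pattern: skip two tab-terminated columns (seqname, source),
# then capture the third column (feature) without its surrounding tabs.
FEATURE_RE = re.compile(r'^(?:[^\t\n]*\t){2}([^\t\n]+)')


def feature_span(line):
    """Return (start, end) of the feature word, or None if there is no match."""
    m = FEATURE_RE.match(line)
    return m.span(1) if m else None


line = 'chr1\tHAVANA\texon\t11869\t12227\t.\t+\t.\tgene_id "X";'
start, end = feature_span(line)
print(repr(line[start:end]))
```

Only the captured group would receive the styled scope, so a background color covers the word and nothing more.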
Note: Make sure to pull the most recent changes to bioKate.xml for the updated color schemes.
Hi all, I am trying to complete the GTF syntax and port it over to gedit (@Ebedthan, @ababaian). As suggested in issue 14, this is near-complete, so I want to know what I need to fix in the Sublime version first, and then how to port it over to gedit. Thank you.