glygener / glygen.cfde.generator

Java program for the generation of CFDE metadata files from GlyGen data.
GNU General Public License v3.0

Missing Ensembl IDs in CFDE protein dictionary (submission prep tool) #17

Open ReneRanzinger opened 2 years ago

ReneRanzinger commented 2 years ago

There are ~1,500 Ensembl IDs that are not accepted by the prep tool right now. We need to figure out what the problem is.

ReneRanzinger commented 2 years ago

Reported as ticket HELP-217 to the help desk.

ReneRanzinger commented 2 years ago

Arthur Brady commented:

We had this trouble with a few genes from another submission as well; see this comment and following. In Mano’s case we just removed the references to the mystery genes and submitted their datapackage that way.

For your case, we have once again a situation where ENSG00000214826 clearly exists but is nevertheless nowhere to be found in the GFF3s describing the GRCh38.106 release we’re using, including patch files and unplaced genetic sequences.

If by any chance you know where in the distro I can find the missing data, let me know and I’ll update the reference; otherwise I’ll have to recommend doing as we did with Metabolomics, and excising the problem genes for this submission while we track down why they’re not in the annotation release.

Clearly you won’t want to retry submission for each gene: I can help quickly, there, if you send me an updated copy of your current submission.
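
To avoid retrying the submission gene by gene, the full list of affected IDs could be computed up front by comparing the IDs in the submission against those present in the annotation GFF3 files. The following is a minimal sketch of that idea, not the prep tool's actual logic. The class `EnsemblIdFilter` and its methods are hypothetical names, and the code assumes that human Ensembl gene IDs have the form `ENSG` followed by 11 digits and that any such ID appearing on a non-comment GFF3 line counts as present in the release.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds submitted gene IDs that are absent from a GFF3 annotation. */
public class EnsemblIdFilter {

    // Assumption: human Ensembl gene IDs are "ENSG" followed by exactly 11 digits.
    private static final Pattern GENE_ID = Pattern.compile("ENSG\\d{11}");

    /** Collects every gene ID mentioned on the non-comment lines of a GFF3 file. */
    public static Set<String> idsInAnnotation(List<String> gff3Lines) {
        Set<String> ids = new HashSet<>();
        for (String line : gff3Lines) {
            if (line.startsWith("#")) {
                continue; // skip GFF3 directives and comments
            }
            Matcher matcher = GENE_ID.matcher(line);
            while (matcher.find()) {
                ids.add(matcher.group());
            }
        }
        return ids;
    }

    /** Returns the submitted IDs that are not in the annotation, in input order. */
    public static List<String> missingIds(List<String> submitted, Set<String> known) {
        List<String> missing = new ArrayList<>();
        for (String id : submitted) {
            if (!known.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }
}
```

Reading the lines of each GFF3 file (including patch files) into `idsInAnnotation` and passing the result to `missingIds` together with the IDs from the submission would yield the set of genes to excise in one pass.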