I tested it out on a made-up fusion:
agfusion annotate -g5 SSX2 -j5 52698786 -g3 FGFR2 -j3 121564577 -db agfusion.homo_sapiens.87.db -o test
It does provide annotation for SSX2. See below:
It could be that your fusion does not contain any of the annotated protein domains. Could also be that you're using an older version of Ensembl that does not contain annotation information for SSX2.
Can you give me the command you're using?
And to answer your question: yes, you could insert the domain information manually into the AGFusion DB. It just requires some SQLite skills. You can look in the "database.py" file in the "fetch_protein_annotation" function to see how I insert domain information into the database.
I should also add I don't want to make any changes to the current AGFusion databases unless the changes are pertinent to the respective ensembl release. For example, if ensembl has no protein domain information for SSX2 for release 80 then I don't want to add it in manually.
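For anyone who wants to try the manual route, here is a minimal sketch of what inserting a domain row with Python's built-in sqlite3 module could look like. The table name `custom_domains`, its columns, and the placeholder protein ID are assumptions made up for illustration; they are not the actual AGFusion schema, so check "fetch_protein_annotation" in "database.py" for the real table layout before touching a real database. The sketch runs against an in-memory database so nothing on disk is modified.

```python
import sqlite3

# Hypothetical schema for illustration only. The real AGFusion tables
# and column names may differ; see database.py for the actual layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS custom_domains (
    protein_id  TEXT NOT NULL,
    domain_name TEXT NOT NULL,
    aa_start    INTEGER NOT NULL,
    aa_end      INTEGER NOT NULL
)
"""


def add_domain(conn, protein_id, domain_name, aa_start, aa_end):
    """Insert one domain annotation (1-based amino acid coordinates)."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("domain coordinates must satisfy 1 <= start <= end")
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO custom_domains VALUES (?, ?, ?, ?)",
        (protein_id, domain_name, aa_start, aa_end),
    )
    conn.commit()


def domains_for(conn, protein_id):
    """Return (name, start, end) tuples for a protein, ordered by start."""
    rows = conn.execute(
        "SELECT domain_name, aa_start, aa_end FROM custom_domains "
        "WHERE protein_id = ? ORDER BY aa_start",
        (protein_id,),
    )
    return rows.fetchall()


# In-memory database, so no real AGFusion file is changed.
conn = sqlite3.connect(":memory:")
# "SSX2_PROTEIN_ID" is a placeholder, not a real Ensembl protein ID.
add_domain(conn, "SSX2_PROTEIN_ID", "SSXRD", 158, 187)
print(domains_for(conn, "SSX2_PROTEIN_ID"))
```

Working on a copy of the .db file first would be the safer choice, since a manual edit makes the database diverge from the Ensembl release it was built from.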
Sorry should have included that - here's what I'm using:
agfusion annotate --gene5prime ATRX --gene3prime SSX2 --junction5prime 77785846 --junction3prime 52700543 -db agfusion.homo_sapiens.87.db -o ATRX-SSX2 --WT
Totally makes sense to me about not adding it if it isn't in the release.
If I look at the SSX2 wild-type generated by agfusion, it has the KRAB domain, but not the SSXRD domain/motif. Note that when agfusion produces the fusion or wild type for SSX1 it does have this domain, and from interpro etc I'm seeing that SSXRD is at the end of SSX2: http://www.ebi.ac.uk/interpro/protein/Q16385
I see what you mean, but it seems the "SSXRD" domain/motif is just not listed in any of the protein annotation databases for Ensembl release 87. I am going to update AGFusion to support up to Ensembl release 91. I tested it out on 91 but still did not see SSXRD listed. If you look at the Ensembl page for SSX2, SSXRD is also not listed for the protein:
I can consider adding a feature to AGFusion so the user can manually include protein annotations via a flag. Would that be useful for you? For example:
agfusion annotate \
--gene5prime ATRX \
--gene3prime SSX2 \
--junction5prime 77785846 \
--junction3prime 52700543 \
-db agfusion.homo_sapiens.87.db \
-o ATRX-SSX2 \
--WT \
--add-domain "SSX2:SSXRD:158-187"
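To make the proposed format concrete, here is a rough sketch of how a "GENE:DOMAIN:START-END" string like the one above might be parsed and validated. The --add-domain flag is only a proposal at this point, and the format is inferred from the single example in the command, so `DomainSpec` and `parse_domain_spec` are hypothetical names and not part of AGFusion.

```python
from typing import NamedTuple


class DomainSpec(NamedTuple):
    """One user-supplied domain annotation (hypothetical structure)."""
    gene: str
    name: str
    start: int  # 1-based amino acid position, inclusive
    end: int    # 1-based amino acid position, inclusive


def parse_domain_spec(spec: str) -> DomainSpec:
    """Parse a string of the assumed form 'GENE:DOMAIN:START-END'."""
    parts = spec.split(":")
    if len(parts) != 3 or not parts[0] or not parts[1]:
        raise ValueError(f"expected GENE:DOMAIN:START-END, got {spec!r}")
    gene, name, span = parts

    # Split the coordinate range on the first hyphen.
    start_text, sep, end_text = span.partition("-")
    if not sep or not start_text.isdecimal() or not end_text.isdecimal():
        raise ValueError(f"expected START-END integers, got {span!r}")

    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"need 1 <= START <= END, got {span!r}")
    return DomainSpec(gene, name, start, end)


spec = parse_domain_spec("SSX2:SSXRD:158-187")
print(spec.gene, spec.name, spec.start, spec.end)
```

Rejecting malformed input up front means a typo in the coordinates fails with a clear message rather than producing a silently wrong annotation in the output plot.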
An additional note: I know I could have AGFusion incorporate more protein annotation sources, but I just don't have the time right now to do so.
Thank you for investigating. That feature would be great but is certainly not necessary; I completely understand about the time constraints.
Will provide a feature to add custom domains in a future release.
Hello, I noticed that there appears to be no domain information in the AGFusion DB for some genes, e.g. SSX2. Is that correct, and is there a way to get that into my AGFusion database?