Open MattHuff opened 2 years ago
I have confirmed that this was due to the GTF file I provided not including exon_number as a feature; after generating a GTF file containing this information, I was able to get the build script working with my genome.
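For anyone hitting the same build failure, a quick way to see whether a GTF carries exon_number on its exon rows is sketched below. This is a minimal standalone check, not part of Bisbee or pyensembl; the function name and the sample rows are made up for illustration.

```python
import re

# Accepts both quoted (exon_number "1";) and unquoted (exon_number 1;) forms.
EXON_NUMBER = re.compile(r'\bexon_number\s+"?\d+"?')

def exons_missing_exon_number(gtf_lines):
    """Return (number of exon rows, number of exon rows without exon_number)."""
    total = 0
    missing = 0
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF row has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        total += 1
        if not EXON_NUMBER.search(fields[8]):
            missing += 1
    return total, missing

# Hypothetical sample rows; in practice pass an open file handle instead.
sample = [
    '#!genome-build example\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1"; exon_number "1";\n',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\tCDS\t100\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";\n',
]
total_exons, missing_exons = exons_missing_exon_number(sample)
print(total_exons, missing_exons)
```

A non-zero second value means some exon rows lack the attribute.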
However, my concern now is that none of my detected splice variants are reported as coding. Even for an event type such as "exon skipping," all of the identified variants are marked as "non-coding." I provided CDS and AA information when building my database, and it is consistent with the GFF3 and GTF files I have used throughout the study. I have tried renaming the CDS and AA sequences in the provided FASTA files, but that hasn't fixed the issue. I understand this is an unprecedented use of Bisbee, but I would like to know whether comparable issues occur when using the default Ensembl data.
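One place to look when everything comes back "non-coding" is which feature types the GTF actually contains. The sketch below tallies the feature column so it is easy to see whether rows such as CDS, start_codon and stop_codon are present at all. That their absence is related to the "non-coding" labels is an assumption, not a confirmed cause; the helper name and sample rows are hypothetical.

```python
from collections import Counter

def count_feature_types(gtf_lines):
    """Count how often each feature type (GTF column 3) occurs."""
    counts = Counter()
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts

# Hypothetical sample rows; in practice pass an open file handle instead.
rows = [
    'chr1\tsrc\tgene\t100\t400\t.\t+\t.\tgene_id "g1";\n',
    'chr1\tsrc\ttranscript\t100\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
feature_counts = count_feature_types(rows)
for name in ("exon", "CDS", "start_codon", "stop_codon"):
    print(name, feature_counts[name])
```

In this made-up sample only exon rows are present, so the other three counts are zero.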
I have used Bisbee to call significant alternative splicing events, and I would like to provide sequences for these alternatively spliced proteins. While the default Bisbee scripts do not support organisms that are not available on Ensembl, they use pyensembl, which includes commands to load outside data into pyensembl's database. I have created my own copy of the build.py script, designed to load my organism as a pyensembl object. However, the most recent updates give me the following error:
This is the command that was run:
And here is the command that was added to my custom copy of build.py:
line 81
line 98
If this is an issue on the pyensembl side, please let me know.
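For context, the general pattern for loading a non-Ensembl organism through pyensembl's Genome class is sketched below. This is not the code from my build.py; the species name, annotation name and file paths are placeholders, and the helper functions are hypothetical. The pyensembl import is done inside the loader so the argument-building part can be read and run on its own.

```python
def custom_genome_kwargs(reference_name, annotation_name,
                         gtf_path, transcript_fasta, protein_fasta):
    """Collect keyword arguments for pyensembl.Genome (hypothetical helper)."""
    return {
        "reference_name": reference_name,
        "annotation_name": annotation_name,
        "gtf_path_or_url": gtf_path,
        # Sequence IDs in these FASTA files are assumed to match the
        # transcript and protein IDs used in the GTF.
        "transcript_fasta_paths_or_urls": [transcript_fasta],
        "protein_fasta_paths_or_urls": [protein_fasta],
    }

def load_custom_genome(**kwargs):
    """Build and index a pyensembl Genome from local files."""
    from pyensembl import Genome  # requires pyensembl to be installed
    genome = Genome(**kwargs)
    genome.index()  # parses the GTF and FASTA files into pyensembl's cache
    return genome

# Placeholder names and paths, for illustration only.
kwargs = custom_genome_kwargs(
    reference_name="my_species_v1",
    annotation_name="my_species_annotation",
    gtf_path="/path/to/my_species.gtf",
    transcript_fasta="/path/to/my_species.transcripts.fa",
    protein_fasta="/path/to/my_species.proteins.fa",
)
# genome = load_custom_genome(**kwargs)  # uncomment once the paths are real
```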