Closed zhuojiuqingyun closed 4 days ago
Please double check and make sure the path is correct:
FileNotFoundError: [Errno 2] No such file or directory: 'data/xenia.proteins.gene_symbol_to_protein_ID.json'
The error occurs because no file exists at that location.
Thanks for your reply.
In fact, data/xenia.proteins.gene_symbol_to_protein_ID.json was never generated because of this earlier error:
TypeError: descriptor 'union' of 'set' object needs an argument
I think the reason is that I couldn't find the protein fasta on Ensembl, so I downloaded it from NCBI. As a result, the description lines in my protein fasta don't contain the gene symbol and protein ID, which caused the error above.
How can I generate protein embeddings from a protein fasta that was not downloaded from Ensembl? Could you give me some instructions?
Thanks a lot!
Ruijie
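For context on the TypeError above: Python raises this kind of error when `set.union` is called on the class itself with no arguments, which happens when an empty list is unpacked into it (the exact wording of the message differs between Python versions). The sketch below only reproduces that behaviour and shows one defensive alternative. It is not the notebook's actual code, and `union_all` is a hypothetical helper name.

```python
def union_all(sets):
    """Union an iterable of sets; returns an empty set for empty input."""
    # Calling union on an empty set instance works even with no arguments.
    return set().union(*sets)


# Calling set.union on the class with an empty argument list raises TypeError,
# e.g. when no header in the fasta yielded any parsed entries.
parsed = []  # assumption: nothing could be parsed from the fasta headers
try:
    set.union(*parsed)
except TypeError as exc:
    print("TypeError:", exc)

print(union_all(parsed))
print(union_all([{"a"}, {"b"}]))
```

If this is indeed what happens in the notebook, an empty result would point to the header parsing step rather than to the union itself.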
Moreover, I want to use SATURN for cross-species annotation of Xenia, but we don't know the exact gene_symbol for each protein_id. Will embeddings created this way influence the training of SATURN? For example, a description line from my protein fasta and a line from my gtf are as follows.
>Xe_029168-T1 Xe_029168
MSSTEEEVEFDIEYIATEVQPYMFEPLASSNNVETDEDLSSSSSTDSSSDEYTHRIGNTNWCECGHCVAMTTGRESICCHEEPKTDPKIHGDHLCIT
HiC_scaffold_1 GenBank CDS 333245 333499 . - 0 transcript_id "Xe_000054-T1"; gene_id "Xe_000054";
Are the exact gene_symbols necessary, or can they be replaced by those serial numbers?
Could you give me some advice? Thank you very much!
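One possible workaround, sketched under assumptions: if the fasta headers follow the `>protein_id gene_id` pattern shown above, a gene-to-protein mapping could be built directly from them, using the gene_id in place of a gene symbol. The function name `build_gene_to_protein_map` is hypothetical. Whether SATURN's gene_symbol_to_protein_ID.json uses this exact layout (gene name mapped to a list of protein IDs) is an assumption and should be checked against the file produced for the Xenopus_tropicalis example.

```python
import json
from collections import defaultdict


def build_gene_to_protein_map(fasta_lines):
    """Map gene IDs to protein IDs from headers like '>Xe_029168-T1 Xe_029168'."""
    mapping = defaultdict(list)
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        if len(fields) < 2:
            continue  # header has no gene ID field; skip it
        protein_id, gene_id = fields[0], fields[1]
        mapping[gene_id].append(protein_id)
    return dict(mapping)


# Header taken from the post above; the sequence is truncated for illustration.
example = [
    ">Xe_029168-T1 Xe_029168",
    "MSSTEEEVEFDIEY",
]
mapping = build_gene_to_protein_map(example)
print(json.dumps(mapping, indent=2))
```

In practice the lines would come from iterating over the opened fasta file rather than from a hard-coded list.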
Dear authors, Thank you very much for making this great tool. I got the following error while running Generate Protein Embeddings.ipynb using my protein fasta.
I didn't encounter this error when running the Xenopus_tropicalis example. Could you help me with my issue?
Yours sincerely, Ruijie