TeselaGen / openVectorEditor

DEPRECATED - Teselagen's Open Source Vector/Plasmid Editor Component

https://teselagen.github.io/tg-oss/ove/#/Editor

MIT License

199 stars 71 forks

couldn't import protein sequence and correctly display #912

Closed mega-bisharp closed 1 year ago

mega-bisharp commented 1 year ago

My protein sequence from NCBI:

AXN76052.1 MRCMSELVVFKANELAISRYDLTEHETKLILCCVALLNPTIENPTRKERTVSFTYNQYAQMMNISRENAYGVLAKATRELMTRTVEIRNPLVKGFEIFQWTNYAKFSSEKLELVFSEEILPYLFQLKKFIKYNLEHVKSFENKYSMRIYEWLLKELTQKKTHKANIEISLDEFKFMLMLENNYHEFKRLNQWVLKPISKDLNTYSNMKLVVDKRGRPTDTLIFQVELDRQMDLVTELENNQIKMNGDKIPTTITSDSHLHNGLRKTLHDALTAKIQLTSFEAKFLSDMQSKYDLNGSFSWLTQKQRTTLENILAKYGRI

Result:

Problem: I think the type here should be "PROTEIN", not "DNA".

@tnrich

tnrich commented 1 year ago

@mega-bisharp can you please attach the file as a ZIP file here? Thanks!

mega-bisharp commented 1 year ago

These are my protein sequences, thank you for your reply! seqdump.zip @tnrich

tnrich commented 1 year ago

@mega-bisharp that file doesn't have a file extension that would indicate that it is a protein. We would need to guess based on the sequence content, which can sometimes be risky.
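
To illustrate why guessing from content is risky, here is a minimal sketch of a content-based heuristic. The function `guessSequenceType` and its 0.9 threshold are hypothetical and are not taken from OVE or bio-parsers; the sketch only assumes that a sequence made up almost entirely of nucleotide letters is treated as DNA.

```typescript
// Hypothetical helper for illustration only; not part of @teselagen/ove or bio-parsers.
type SequenceType = "DNA" | "PROTEIN";

function guessSequenceType(sequence: string, threshold: number = 0.9): SequenceType {
  // Keep letters only, so whitespace, digits and gap characters do not skew the ratio.
  const residues = sequence.toUpperCase().replace(/[^A-Z]/g, "");
  if (residues.length === 0) {
    return "DNA"; // assumed default for empty input
  }
  // Count letters that are plausible nucleotide codes (A, C, G, T, U, N).
  const nucleotideCount = (residues.match(/[ACGTUN]/g) || []).length;
  return nucleotideCount / residues.length >= threshold ? "DNA" : "PROTEIN";
}

// The start of the sequence from this issue is clearly not nucleotide-like.
console.log(guessSequenceType("MRCMSELVVFKANELAISRYDL")); // PROTEIN

// A, C, G and T are also valid amino acid codes, so a short peptide
// written with only those letters is indistinguishable from DNA.
console.log(guessSequenceType("GATTACA")); // DNA
```

The second call shows the failure mode: the heuristic has no way to tell a peptide built from A, C, G and T apart from a nucleotide sequence, which is why an explicit signal such as a file extension is safer.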

tnrich commented 1 year ago

Also this is the new repo that OVE lives in - https://github.com/TeselaGen/tg-oss

mega-bisharp commented 1 year ago

So, what's the correct file extension for a protein sequence? Thank you for your answer! @tnrich

tnrich commented 1 year ago

@mega-bisharp I believe the format you're looking for is .faa, since your data is in the FASTA format.

I'll actually need to update the code here https://github.com/TeselaGen/tg-oss/ (that's the new repo for ove/bio-parsers, this one is deprecated now) in order to handle .faa files correctly. I'll do that now.

tnrich commented 1 year ago

@mega-bisharp ok, I've updated @teselagen/ove to v0.3.11 which should include automatic parsing of .faa files to protein. Let me know if that works for you :)
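
For readers wondering what extension-based detection looks like in principle, the following is a rough sketch and not the actual change made in tg-oss. The function `sequenceTypeFromFilename` and the extension table are hypothetical. The sketch assumes the common convention that `.faa` marks FASTA amino acid files and `.fna` marks FASTA nucleic acid files, and it returns `null` for anything else so the caller can decide how to fall back.

```typescript
// Hypothetical sketch; the real bio-parsers implementation may differ.
type SequenceType = "DNA" | "PROTEIN";

// Assumed mapping from FASTA-style extensions to sequence types.
const EXTENSION_TYPES = new Map<string, SequenceType>([
  ["faa", "PROTEIN"], // FASTA amino acid
  ["fna", "DNA"], // FASTA nucleic acid
]);

function sequenceTypeFromFilename(filename: string): SequenceType | null {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) {
    return null; // no extension, so the name gives no hint
  }
  const extension = filename.slice(dot + 1).toLowerCase();
  const type = EXTENSION_TYPES.get(extension);
  return type === undefined ? null : type;
}

console.log(sequenceTypeFromFilename("example.faa")); // PROTEIN
console.log(sequenceTypeFromFilename("example.fasta")); // null (generic FASTA, type unknown)
console.log(sequenceTypeFromFilename("seqdump")); // null (no extension)
```

Under these assumptions, renaming a protein FASTA file so that it ends in `.faa` gives the importer an unambiguous signal, while generic or missing extensions leave the type undecided.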