soedinglab / hh-suite

Remote protein homology detection suite.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3019-7
GNU General Public License v3.0

cif2fasta.py (3.2.0) incorrectly generates the protein description when a CIF file contains multiple sequences #236

Open igortru opened 3 years ago

igortru commented 3 years ago

Code:

        protein_description = struct.getValue('pdbx_descriptor')
        protein_description = protein_description.replace('\n', ' ')
        protein_description = protein_description.replace(';', ' ') # to prevent parsing errors

        if len(protein_description.split(' ')) >= 5:
            protein_description = ' '.join(protein_description.split(' ')[0:5]) # maximum of 5 words in header

CIF:

_struct.pdbx_descriptor ;30S ribosomal protein S2, 30S ribosomal protein S3, 30S ribosomal protein S4, 30S ribosomal protein S5, 30S ribosomal protein S6, 30S ribosomal protein S7, 30S ribosomal protein S8, 30S ribosomal protein S9, 30S ribosomal protein S10, 30S ribosomal protein S11, 30S ribosomal protein S12, 30S ribosomal protein S13, 30S ribosomal protein S14 type Z, 30S ribosomal protein S15, 30S ribosomal protein S16, 30S ribosomal protein S17, 30S ribosomal protein S18, 30S ribosomal protein S19, 30S ribosomal protein S20, 30S ribosomal protein S21, Ribosomal subunit interface protein, 50S ribosomal protein L2, 50S ribosomal protein L3, 50S ribosomal protein L4, 50S ribosomal protein L5, 50S ribosomal protein L6, 50S ribosomal protein L13, 50S ribosomal protein L14, 50S ribosomal protein L15, 50S ribosomal protein L16, 50S ribosomal protein L17, 50S ribosomal protein L18, 50S ribosomal protein L19, 50S ribosomal protein L20, 50S ribosomal protein L21, 50S ribosomal protein L22, 50S ribosomal protein L23, 50S ribosomal protein L24, 50S ribosomal protein L25, 50S ribosomal protein L27, 50S ribosomal protein L28, 50S ribosomal protein L29, 50S ribosomal protein L30, 50S ribosomal protein L31 type B, 50S ribosomal protein L32, 50S ribosomal protein L33, 50S ribosomal protein L34, 50S ribosomal protein L35, 50S ribosomal protein L36/RNA Complex

How HHpred sees it in PDB70:

NAME 5NGM_Au 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_Ar 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_AY 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_AU 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_Am 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_Ak 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_Av 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_AT 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus}
NAME 5NGM_Ab 30S ribosomal protein S2, 30S; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus
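
The truncation can be reproduced outside cif2fasta.py. The following is a minimal sketch, assuming that `struct.getValue('pdbx_descriptor')` returns the descriptor text without the leading `;` delimiter. `truncate_description` is a hypothetical helper that only wraps the lines quoted in this comment, and the descriptor is shortened to its first three entries.

```python
def truncate_description(protein_description, max_words=5):
    # Same steps as the quoted cif2fasta.py lines, wrapped in a hypothetical helper.
    protein_description = protein_description.replace('\n', ' ')
    protein_description = protein_description.replace(';', ' ')  # to prevent parsing errors
    if len(protein_description.split(' ')) >= max_words:
        # Keep only the first max_words whitespace-separated words.
        protein_description = ' '.join(protein_description.split(' ')[0:max_words])
    return protein_description


# Shortened 5NGM descriptor (first three entries only).
descriptor = ("30S ribosomal protein S2, 30S ribosomal protein S3, "
              "30S ribosomal protein S4")
name = truncate_description(descriptor)
print(name)
```

Because the cut is applied to the entry-wide descriptor rather than to a per-chain name, every chain of the entry would receive the same five words, which matches the identical NAME lines in the listing.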

igortru commented 3 years ago

5NGM_Ab 30S ribosomal protein S2; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAVISMKQLLEAGVHFGHQTRRWNPKMKKYIFTERNGIYIIDLQKTVKKVDEAYNFLKQVSEDGGQVLFVGTKKQAQESV KSEAERAGQFYINQRWLGGLLTNYKTISKRIKRISEIEKMEEDGLFEVLPKKEVVELKKEYDRLIKFLGGIRDMKSMPQA LFVVDPRKERNAIAEARKLNIPIVGIVDTNCDPDEIDYVIPANDDAIRAVKLLTAKMADAILEGQQGVSNEEVAAEQNID LDEKEKSEETEATEE 5NGM_Ac 30S ribosomal protein S3; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MGQKINPIGLRVGIIRDWEAKWYAEKDFASLLHEDLKIRKFIDNELKEASVSHVEIERAANRINIAIHTGKPGMVIGKGG SEIEKLRNKLNALTDKKVHINVIEIKKVDLDARLVAENIARQLENRASFRRVQKQAITRAMKLGAKGIKTQVSGRLGGAD IARAEQYSEGTVPLHTLRADIDYAHAEADTTYGKLGVKVWIYRGEVLPTKNTSGGGK 5NGM_Ad 30S ribosomal protein S4; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MARFRGSNWKKSRRLGISLSGTGKELEKRPYAPGQHGPNQRKKLSEYGLQLREKQKLRYLYGMTERQFRNTFDIAGKKFG VHGENFMILLASRLDAVVYSLGLARTRRQARQLVNHGHILVDGKRVDIPSYSVKPGQTISVREKSQKLNIIVESVEINNF VPEYLNFDADSLTGTFVRLPERSELPAEINEQLIVEYYSR 5NGM_Ae 30S ribosomal protein S5; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MARREEETKEFEERVVTINRVAKVVKGGRRFRFTALVVVGDKNGRVGFGTGKAQEVPEAIKKAVEAAKKDLVVVPRVEGT TPHTITGRYGSGSVFMKPAAPGTGVIAGGPVRAVLELAGITDILSKSLGSNTPINMVRATIDGLQNLKNAEDVAKLRGKT VEELYN 5NGM_Af 30S ribosomal protein S6; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MRTYEVMYIVRPNIEEDAKKALVERFNGILATEGAEVLEAKDWGKRRLAYEINDFKDGFYNIVRVKSDNNKATDEFQRLA KISDDIIRYMVIREDEDK 5NGM_Ag 30S ribosomal protein S7; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MPRKGSVPKRDVLPDPIHNSKLVTKLINKIMLDGKRGTAQRILYSAFDLVEQRSGRDALEVFEEAINNIMPVLEVKARRV GGSNYQVPVEVRPERRTTLGLRWLVNYARLRGEKTMEDRLANEILDAANNTGGAVKKREDTHKMAEANKAFAHYRW 5NGM_Ah 30S ribosomal protein S8; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} 
MTMTDPIADMLTRVRNANMVRHEKLELPASNIKKEIAEILKSEGFIKNVEYVEDDKQGVLRLFLKYGQNDERVITGLKRI SKPGLRVYAKASEMPKVLNGLGIALVSTSEGVITDKEARKRNVGGEIIAYVW 5NGM_Ai 30S ribosomal protein S9; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MTLAQVEYRGTGRRKNSVARVRLVPGEGNITVNNRDVREYLPFESLILDLNQPFDVTETKGNYDVLVNVHGGGFTGQAQA IRHGIARALLEADPEYRGSLKRAGLLTRDPRMKERKKPGLKAARRSPQFSKR 5NGM_Aj 30S ribosomal protein S10; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAKQKIRIRLKAYDHRVIDQSAEKIVETAKRSGADVSGPIPLPTEKSVYTIIRAVHKYKDSREQFEQRTHKRLIDIVNPT PKTVDALMGLNLPSGVDIEIKL 5NGM_Ak 30S ribosomal protein S11; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MARKQVSRKRRVKKNIENGVAHIRSTFNNTIVTITDEFGNALSWSSAGALGFKGSKKSTPFAAQMASETASKSAMEHGLK TVEVTVKGPGPGRESAIRALQSAGLEVTAIRDVTPVPHNGCRPPKRRRV 5NGM_Al 30S ribosomal protein S12; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MPTINQLVRKPRQSKIKKSDSPALNKGFNSKKKKFTDLNSPQKRGVCTRVGTMTPKKPNSALRKYARVRLSNNIEINAYI PGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGALDTSGVDGRRQGRSLYGTKKPKN 5NGM_Am 30S ribosomal protein S13; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MARIAGVDIPREKRVVISLTYIYGIGTSTAQKILEEANVSADTRVKDLTDDELGRIREVVDGYKVEGDLRRETNLNIKRL MEISSYRGIRHRRGLPVRGQKTKNNARTRKGPVKTVANKKK 5NGM_An 30S ribosomal protein S14 type Z; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAKTSMVAKQQKKQKYAVREYTRCERCGRPHSVYRKFKLCRICFRELAYKGQIPGVRKASW 5NGM_Ao 30S ribosomal protein S15; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAISQERKNEIIKEYRVHETDTGSPEVQIAVLTAEINAVNEHLRTHKKDHHSRRGLLKMVGRRRHLLNYLRSKDIQRYRE LIKSLGIRR 5NGM_Ap 30S ribosomal protein S16; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAVKIRLTRLGSKRNPFYRIVVADARSPRDGRIIEQIGTYNPTSANAPEIKVDEALALKWLNDGAKPTDTVHNILSKEGI MKKFDEQKKAK 
5NGM_Aq 30S ribosomal protein S17; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MSERNDRKVYVGKVVSDKMDKTITVLVETYKTHKLYGKRVKYSKKYKTHDENNSAKLGDIVKIQETRPLSATKRFRLVEI VEESVII 5NGM_Ar 30S ribosomal protein S18; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAGGPRRGGRRRKKVCYFTANGITHIDYKDTELLKRFISERGKILPRRVTGTSAKYQRMLTTAIKRSRHMALLPYVKEEQ 5NGM_As 30S ribosomal protein S19; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MARSIKKGPFVDEHLMKKVEAQEGSEKKQVIKTWSRRSTIFPNFIGHTFAVYDGRKHVPVYVTEDMVGHKLGEFAPTRTF KGHVADDKKTRR 5NGM_At 30S ribosomal protein S20; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MANIKSAIKRVKTTEKAEARNISQKSAMRTAVKNAKTAVSNNADNKNELVSLAVKLVDKAAQSNLIHSNKADRIKSQLMT ANK 5NGM_Au 30S ribosomal protein S21; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MSKTVVRKNESLEDALRRFKRSVSKSGTIQEVRKREFYEKPSVKRKKKSEAARKRKFK 5NGM_Av Ribosomal subunit interface protein; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MIRFEIHGDNLTITDAIRNYIEEKIGKLERYFNDVPNAVAHVKVKTYSNSATKIEVTIPLKNVTLRAEERNDDLYAGIDL INNKLERQVRKYKTRINRKSRDRGDQEVFVAELQEMQETQVDNDAYDDNEIEIIRSKEFSLKPMDSEEAVLQMNLLGHDF FVFTDRETDGTSIVYRRKDGKYGLIQTSEQ 5NGM_AC 50S ribosomal protein L2; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAIKKYKPITNGRRNMTSLDFAEITKTTPEKSLLKPLPKKAGRNNQGKLTVRHHGGGHKRQYRVIDFKRNKDGINAKVDS IQYDPNRSANIALVVYADGEKRYIIAPKGLEVGQIVESGAEADIKVGNALPLQNIPVGTVVHNIELKPGKGGQIARSAGA SAQVLGKEGKYVLIRLRSGEVRMILSTCRATIGQVGNLQHELVNVGKAGRSRWKGIRPTVRGSVMNPNDHPHGGGEGRAP IGRPSPMSPWGKPTLGKKTRRGKKSSDKLIVRGRKKK 5NGM_AD 50S ribosomal protein L3; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MTKGILGRKIGMTQVFGENGELIPVTVVEAKENVVLQKKTVEVDGYNAIQVGFEDKKAYKKDAKSNKYANKPAEGHAKKA 
DAAPKRFIREFRNVDVDAYEVGQEVSVDTFVAGDVIDVTGVSKGKGFQGAIKRHGQSRGPMSHGSHFHRAPGSVGMASDA SRVFKGQKMPGRMGGNTVTVQNLEVVQVDTENKVILVKGNVPGPKKGLVEIRTSIKKGNK 5NGM_AE 50S ribosomal protein L4; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MANYDVLKLDGTKSGSIELSDAVFGIEPNNSVLFEAINLQRASLRQGTHAVKNRSAVSGGGRKPWKQKGTGRARQGTIRA PQWRGGGIVFGPTPRSYAYKMPKKMRRLALRSALSFKAQENGLTVVDAFNFEAPKTKEFKNVLSTLEQPKKVLVVTENED VNVELSARNIPGVQVTTAQGLNVLDITNADSLVITEAAAKKVEEVLG 5NGM_AF 50S ribosomal protein L5; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MNRLKEKFNTEVTENLMKKFNYSSVMEVPKIDKIVVNMGVGDAVQNSKVLDNAVEELELITGQKPLVTKAKKSIATFRLR EGMPIGAKVTLRGERMYEFLDKLISVSLPRVRDFQGVSKKAFDGRGNYTLGVKEQLIFPEIDYDKVSKVRGMDIVIVTTA NTDEEARELLANFGMPFRK 5NGM_AG 50S ribosomal protein L6; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MSRVGKKIIDIPSDVTVTFDGNHVTVKGPKGELSRTLNERMTFKQEENTIEVVRPSDSKEDRTNHGTTRALLNNMVQGVS QGYVKVLELVGVGYRAQMQGKDLILNVGYSHPVEIKAEENITFSVEKNTVVKVEGISKEQVGALASNIRSVRPPEPYKGK GIRYQGEYVRRKEGKTGK 5NGM_AH 50S ribosomal protein L13; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MRQTFMANESNIERKWYVIDAEGQTLGRLSSEVASILRGKNKVTYTPHVDTGDYVIVINASKIEFTGNKETDKVYYRHSN HPGGIKSITAGELRRTNPERLIENSIKGMLPSTRLGEKQGKKLFVYGGAEHPHAAQQPENYELRG 5NGM_AI 50S ribosomal protein L14; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MIQQETRLKVADNSGAREVLTIKVLGGSGRKTANIGDVIVCTVKNATPGGVVKKGDVVKAVIVRTKSGVRRNDGSYIKFD ENACVIIRDDKGPRGTRIFGPVARELREGNFMKIVSLAPEVL 5NGM_AJ 50S ribosomal protein L15; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MKLHELKPAEGSRKERNRVGRGVATGNGKTSGRGHKGQKARSGGGVRPGFEGGQLPLFRRLPKRGFTNINRKEYAIVNLD QLNKFEDGTEVTPALLVESGVVKNEKSGIKILGNGSLDKKLTVKAHKFSASAAEAIDAKGGAHEVI 5NGM_AK 50S ribosomal protein L16; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} 
MLLPKRVKYRRQHRPKTTGRSKGGNYVTFGEFGLQATTTSWITSRQIESARIAMTRYMKRGGKVWIKIFPHTPYTKKPLE VRMGAGKGAVEGWIAVVKPGRILFEVAGVSEEVAREALRLASHKLPVKTKFVKREELGGETNES 5NGM_AL 50S ribosomal protein L17; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MGYRKLGRTSDQRKAMLRDLATSLIISERIETTEARAKEVRSVVEKLITLGKKGDLASRRNAAKTLRNVEILNEDETTQT ALQKLFGEIAERYTERQGGYTRILKQGPRRGDGAESVIIELV 5NGM_AM 50S ribosomal protein L18; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MISKIDKNKVRLKRHARVRTNLSGTAEKPRLNVYRSNKHIYAQIIDDNKGVTLAQASSKDSDIATTATKVELATKVGEAI AKKAADKGIKEIVFDRGGYLYHGRVKALAEAARESGLEF 5NGM_AN 50S ribosomal protein L19; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MTNHKLIEAVTKSQLRTDLPSFRPGDTLRVHVRIIEGTRERIQVFEGVVIKRRGGGVSETFTVRKISSGVGVERTFPLHT PKIEKIEVKRRGKVRRAKLYYLRSLRGKAARIQEIR 5NGM_AO 50S ribosomal protein L20; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MPRVKGGTVTRARRKKTIKLAKGYFGSKHTLYKVAKQQVMKSGQYAFRDRRQRKRDFRKLWITRINAAARQHEMSYSRLM NGLKKAGIDINRKMLSEIAISDEKAFAQLVTKAKDALK 5NGM_AP 50S ribosomal protein L21; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MFAIIETGGKQIKVEEGQEIFVEKLDVNEGDTFTFDKVLFVGGDSVKVGAPTVEGATVTATVNKQGRGKKITVFTYKRRK NSKRKKGHRQPYTKLTIDKINA 5NGM_AQ 50S ribosomal protein L22; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MEAKAVARTIRIAPRKVRLVLDLIRGKNAAEAIAILKLTNKASSPVIEKVLMSALANAEHNYDMNTDELVVKEAYANEGP TLKRFRPRAQGRASAINKRTSHITIVVSDGKEEAKEA 5NGM_AR 50S ribosomal protein L23; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MEARDILKRPVITEKSSEAMAEDKYTFDVDTRVNKTQVKMAVEEIFNVKVASVNIMNYKPKKKRMGRYQGYTNKRRKAIV TLKEGSIDLFN 5NGM_AS 50S ribosomal protein L24; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} 
MHIKKGDNVKVIAGKDKGKEGKVIATLPKKDRVVVEGVNIMKKHQKPTQLNPEGGILETEAAIHVSNVQLLDPKTNEPTR VGYKFVDGKKVRIAKKSGEEIKSNN 5NGM_AT 50S ribosomal protein L25; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MASLKSIIRQGKQTRSDLKQLRKSGKVPAVVYGYGTKNVSVKVDEVEFIKVIREVGRNGVIELGVGSKTIKVMVADYQFD PLKNQITHIDFLAINMSEERTVEVPVQLVGEAVGAKEGGVVEQPLFNLEVTATPDNIPEAIEVDITELNINDSLTVADVK VTGDFKIENDSAESVVTVVAPTEEPTEEEIEAMEGEQQTEEPEVVGESKEDEEKTEE 5NGM_AU 50S ribosomal protein L27; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MLKLNLQFFASKKGVSSTKNGRDSESKRLGAKRADGQFVTGGSILYRQRGTKIYPGENVGRGGDDTLFAKIDGVVKFERK GRDKKQVSVYAVAE 5NGM_AV 50S ribosomal protein L28; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MGKQCFVTGRKASTGNRRSHALNSTKRRWNANLQKVRILVDGKPKKVWVSARALKSGKVTRV 5NGM_AW 50S ribosomal protein L29; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MVKQMKAKEIRDLTTSEIEEQIKSSKEELFNLRFQLATGQLEETARIRTVRKTIARLKTVAREREIEQSKANQ 5NGM_AX 50S ribosomal protein L30; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAKLQITLTRSVIGRPETQRKTVEALGLKKTNSSVVVEDNPAIRGQINKVKHLVTVEEK 5NGM_AY 50S ribosomal protein L31 type B; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MKQGIHPEYHQVIFLDTTTNFKFLSGSTKTSSEMMEWEDGKEYPVIRLDISSDSHPFYTGRQKFAAADGRVERFNKKFGL KSNN 5NGM_AZ 50S ribosomal protein L32; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MAVPKRRTSKTRKNKRRTHFKISVPGMTECPNCGEYKLSHRVCKNCGSYNGEEVAAK 5NGM_A1 50S ribosomal protein L33; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MRVNVTLACTECGDRNYITTKNKRNNPERIEMKKYCPRLNKYTLHRETK 5NGM_A2 50S ribosomal protein L34; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MVKRTYQPNKRKHSKVHGFRKRMSTKNGRKVLARRRRKGRKVLSA 5NGM_A3 50S ribosomal protein L35; 
Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MPKMKTHRGAAKRVKRTASGQLKRSRAFTSHLFANKSTKQKRQLRKARLVSKSDMKRVKQLLAYKK 5NGM_A4 50S ribosomal protein L36/RNA Complex; Ribosome Cryo-EM Structural Biology Hibernation; HET: MG; 2.9A {Staphylococcus aureus} MKVRPSVKPICEKCKVIKRKGKVMVICENPKHKQRQG
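
The headers in this listing differ from the PDB70 ones only in the first field, where each chain carries its own protein name. A minimal sketch of assembling such headers, assuming a chain-to-name mapping is already available; `chain_header` and `chain_names` are hypothetical names and not part of cif2fasta.py:

```python
def chain_header(pdb_id, chain_id, description, keywords, het, resolution, organism):
    # Assemble one header in the layout shown in the listing:
    # ID_CHAIN description; keywords; HET: ...; resolutionA {organism}
    return "{}_{} {}; {}; HET: {}; {}A {{{}}}".format(
        pdb_id, chain_id, description, keywords, het, resolution, organism)


# Hypothetical per-chain names, taken from two entries of the listing.
chain_names = {
    "Ab": "30S ribosomal protein S2",
    "Ac": "30S ribosomal protein S3",
}

headers = [
    chain_header("5NGM", chain, name,
                 "Ribosome Cryo-EM Structural Biology Hibernation",
                 "MG", "2.9", "Staphylococcus aureus")
    for chain, name in chain_names.items()
]
for header in headers:
    print(header)
```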

igortru commented 3 years ago

A possible fix for 5NGM: https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/FamilyPhylogeneticTree/Graph/cif2fasta.py. It still needs to be checked against the whole PDB.

milot-mirdita commented 3 years ago

Could you please submit it as a pull request?

igortru commented 3 years ago

I can, but it is not a “complete” fix. It is a very serious change to the code that changes the names across the whole PDB. I just don’t know exactly how pull requests work. I mean, I have only provided an example of how it can be fixed, and I am not completely sure it is correct.


On Dec 21, 2020, at 6:26 AM, Milot Mirdita notifications@github.com wrote:

 Could you please submit it as a pull request?


igortru commented 3 years ago

The problem with pdbx_descriptor is very serious. My current cif2fasta crashes on 4v61. That could be fixed, but for this entry pdbx_descriptor contains just one word, "Ribosome". The real protein names can be found in _entity.details:

1 polymer nat '16S rRNA' 483490.531 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
2 polymer nat 'Ribosomal Protein S2' 26092.336 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
3 polymer nat 'Ribosomal Protein S3' 24965.156 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
4 polymer nat 'Ribosomal Protein S4' 23454.531 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
5 polymer nat 'Ribosomal Protein S5' 33626.363 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
6 polymer nat 'Ribosomal Protein S6' 18870.361 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
7 polymer nat 'Ribosomal Protein S7' 17378.309 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
8 polymer nat 'Ribosomal Protein S8' 15527.256 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
9 polymer nat 'Ribosomal Protein S9' 21350.842 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
10 polymer nat 'Ribosomal Protein S10' 21632.629 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
11 polymer nat 'Ribosomal Protein S11' 15085.706 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
12 polymer nat 'Ribosomal Protein S12' 13794.261 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
13 polymer nat 'Ribosomal Protein S13' 16306.087 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
14 polymer nat 'Ribosomal Protein S14' 11809.832 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
15 polymer nat 'Ribosomal Protein S15' 10778.763 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
16 polymer nat 'Ribosomal Protein S16' 10454.237 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
17 polymer nat 'Ribosomal Protein S17' 15828.694 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
18 polymer nat 'Ribosomal Protein S18' 12337.430 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
19 polymer nat 'Ribosomal Protein S19' 10632.500 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
20 polymer nat 'Ribosomal Protein S20' 21866.373 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
21 polymer nat 'Ribosomal Protein S21' 21701.236 1 ? ? ? 'modeled using Escherichia coli 2AVY as template'
22 polymer nat '23S rRNA' 911368.312 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
23 polymer nat '5S rRNA' 37743.441 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
24 polymer nat '4.8S rRNA' 33330.867 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
25 polymer nat 'Ribosomal Protein L1' 38681.484 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
26 polymer nat 'Ribosomal Protein L2' 29480.168 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
27 polymer nat 'Ribosomal Protein L3' 28423.457 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
28 polymer nat 'Ribosomal Protein L4' 32479.617 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
29 polymer nat 'Ribosomal Protein L5' 24248.189 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
30 polymer nat 'Ribosomal Protein L6' 24747.686 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
31 polymer nat 'Ribosomal Protein L9' 22169.826 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
32 polymer nat 'Ribosomal Protein L11' 23689.980 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
33 polymer nat 'Ribosomal Protein L13' 28211.629 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
34 polymer nat 'Ribosomal Protein L14' 13484.741 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
35 polymer nat 'Ribosomal Protein L15' 27491.221 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
36 polymer nat 'Ribosomal Protein L16' 15328.068 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
37 polymer nat 'Ribosomal Protein L17' 22983.852 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
38 polymer nat 'Ribosomal Protein L18' 17850.727 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
39 polymer nat 'Ribosomal Protein L19' 26121.254 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
40 polymer nat 'Ribosomal Protein L20' 14617.331 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
41 polymer nat 'Ribosomal Protein L21' 28554.883 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
42 polymer nat 'Ribosomal Protein L22' 23276.654 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
43 polymer nat 'Ribosomal Protein L23' 21832.164 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
44 polymer nat 'Ribosomal Protein L24' 21481.088 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
45 polymer nat 'Ribosomal Protein L27' 21771.951 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
46 polymer nat 'Ribosomal Protein L28' 16730.691 1 ? ? ? 'modeled using Thermus thermophilus 2J01 as template'
47 polymer nat 'Ribosomal Protein L29' 19415.543 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
48 polymer nat 'Ribosomal Protein L31' 16060.906 1 ? ? ? 'modeled using Thermus thermophilus 2J01 as template'
49 polymer nat 'Ribosomal Protein L32' 6650.969 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
50 polymer nat 'Ribosomal Protein L33' 7668.121 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
51 polymer nat 'Ribosomal Protein L34' 16126.633 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
52 polymer nat 'Ribosomal Protein L35' 17376.322 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
53 polymer nat 'Ribosomal Protein L36' 11509.490 1 ? ? ? 'modeled using Escherichia coli 2AWB as template'
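
Per-entity names could be pulled out of records like these. The sketch below is only an illustration under stated assumptions: each record sits on one line, the quoted name is the fourth field (the loop header is not shown in this issue, so the column position is a guess from the data), and `shlex` is used as a stand-in tokenizer rather than a real CIF parser. `entity_names` is a hypothetical helper.

```python
import shlex


def entity_names(rows):
    """Map entity id -> quoted name, assuming the name is the fourth field."""
    names = {}
    for row in rows:
        # shlex.split keeps single-quoted values such as 'Ribosomal Protein S2'
        # together as one field; it follows shell quoting rules, not CIF rules.
        fields = shlex.split(row)
        names[fields[0]] = fields[3]
    return names


rows = [
    "1 polymer nat '16S rRNA' 483490.531 1 ? ? ? "
    "'modeled using Escherichia coli 2AVY as template'",
    "2 polymer nat 'Ribosomal Protein S2' 26092.336 1 ? ? ? "
    "'modeled using Escherichia coli 2AVY as template'",
]
names = entity_names(rows)
print(names)
```

Shell-style tokenizing would break on values that contain an apostrophe, so a real fix would more likely read the named column through the CIF parser that cif2fasta.py already uses.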