sokrypton / ColabFold

Making Protein folding accessible to all!
MIT License
Index error/ List Index out of range #136

Open dsairam789 opened 2 years ago

dsairam789 commented 2 years ago

I was predicting the structure of a complex with the ColabFold AlphaFold2 notebook by uploading unpaired alignments of the monomers. I encountered the error "List index out of range" at the Prediction step and am unsure how to resolve it. I tried resetting the runtime (factory reset) and then Run all, but the error persists. Could you help me with this?

P.S. After looking through the issues already reported here, I noticed that my problem is similar to the one in https://github.com/sokrypton/ColabFold/issues/128.

I am happy to provide any further information.

dsairam789 commented 2 years ago

Other details about the issue

Notebook Name: AlphaFold2.ipynb

Input Query : MDADKIVFKVNNQVVSLKPEIIVDQHEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGM SAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNW ALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQNTGNYKTNIADRIEQIFETAPFVK IVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPY SSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFF RDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIMMNGGRLKRSH IRRYVSVSSNHQARPNSFAEFLNKTYSSDS:MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDD GKSPNPGE

Parameters Checked
AMBER: Yes
Save to Google_Drive: Yes
MSA Mode: Custom
Model_Type: Alphafold2-ptm
Pair Mode: Unpaired+Paired

Attachments: Screenshot_2022-01-13 Google Colaboratory, Screenshot_2022-01-13 Google Colaboratory(1), MSA_A3M_Files.zip

martin-steinegger commented 2 years ago

You need to express your complex in a single a3m file. My answer to the following issue explains how the MSAs (a3ms) should be formatted: https://github.com/sokrypton/ColabFold/issues/76

dsairam789 commented 2 years ago

Thanks for your reply and the link to the earlier thread. I did exactly as you suggested in #76, but I still get a similar error.

Attachments: Screenshot_2022-01-13 Google Colaboratory(2), Screenshot_2022-01-13 Google Colaboratory(3), ezyzip.zip

martin-steinegger commented 2 years ago

I made a small example that runs. You need to make sure that the two fields of the header (#) line, the chain lengths and the copy numbers, are separated by a tab.

#423,68	1,1
>P06747 P06025
YKYPAIKDLKKPCITLGKAPDLNKAYKSVLSCMSAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQSTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIIMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDSMSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGE
>P06747
YKYPAIKDLKKPCITLGKAPDLNKAYKSVLSCMSAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQSTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIIMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS--------------------------------------------------------------------
>UniRef100_A0A0 Nucleocapsid protein (Fragment) n=1 Tax=Rabies lyssavirus TaxID=11292 RepID=A0A096XWU8_9RHAB
----AIKDLKKPSITLGKAPDLNKAYKSVLSGMNAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGILIARKGDKITPDSLVEIKRTDVEGNWALTGGMELTRDPTVSEHASLVGLLLSLYRLSKISGQNTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVHSDDEDYFSGETRSPEAVYTR--------------------------------------------------------------------------------------------------------------
>P06025
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGE
>UniRef100_A0A0 Phosphoprotein n=1 Tax=Rabies lyssavirus TaxID=11292 RepID=A0A023GV82_9RHAB
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMRRLQLDDGKPSNL--

sml.a3m.zip
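
For readers who want to assemble such a file programmatically, here is a minimal Python sketch of the layout shown in the example above: a `#` header with the chain lengths and copy numbers separated by a tab, the concatenated query as the first entry, then each chain's sequences padded with gaps over the other chain's columns. The function names `build_complex_a3m` and `parse_header` are hypothetical helpers, not part of ColabFold, and the tab between the names in the first `>` line is an assumption. It only handles one copy per chain, as in the example.

```python
def build_complex_a3m(chains):
    """Assemble a single a3m text for a complex.

    chains: list of (name, query_seq, hits), where hits is a list of
    (header, aligned_seq) already aligned to that chain's query.
    Hypothetical helper; assumes one copy of each chain.
    """
    lengths = [len(query) for _, query, _ in chains]
    total = sum(lengths)

    # Header line: comma-separated lengths, a TAB, comma-separated copy numbers.
    header = "#" + ",".join(str(n) for n in lengths) + "\t" + ",".join("1" for _ in chains)
    lines = [header]

    # First entry: all query sequences concatenated.
    # Assumption: the names on this line are tab-separated.
    lines.append(">" + "\t".join(name for name, _, _ in chains))
    lines.append("".join(query for _, query, _ in chains))

    # Unpaired blocks: each chain's rows are padded with '-' over the other chains.
    offset = 0
    for (name, query, hits), length in zip(chains, lengths):
        left = "-" * offset
        right = "-" * (total - offset - length)
        for row_header, seq in [(name, query)] + list(hits):
            lines.append(">" + row_header)
            lines.append(left + seq + right)
        offset += length
    return "\n".join(lines) + "\n"


def parse_header(a3m_text):
    """Check that the '#' header is tab-separated and matches the query length."""
    rows = a3m_text.splitlines()
    first = rows[0]
    if not first.startswith("#") or "\t" not in first:
        raise ValueError("header must start with '#' and use a tab between fields")
    lengths_field, copies_field = first[1:].split("\t")
    lengths = [int(x) for x in lengths_field.split(",")]
    copies = [int(x) for x in copies_field.split(",")]
    if len(lengths) != len(copies):
        raise ValueError("need one copy number per chain length")
    # With one copy per chain, the first sequence should span all chains.
    if sum(lengths) != len(rows[2]):
        raise ValueError("chain lengths do not add up to the query length")
    return lengths, copies
```

Running `parse_header` on a hand-edited file before uploading is a cheap way to catch a header whose tab was silently turned into spaces by a text editor.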

Scrupy07 commented 2 years ago

Hello guys, I found this post while looking for an answer to the "list index out of range" error. In my case, I want to predict the structure of a single protein with the standard parameters, but the error appears every time I try. The sequence is:

"MENPMERELSCTICMELYNEPLLLPCAHTFCRKCLEDLIAKSNGFAATATSTCHGDDGGDCRPEADEPCDVPTDSKDSDGIVINCPTCRREVILRSETGLNGLVRNFLLDSIVSRYKQQETSIVRQTCEICDDKPAITKCVQCQVTYCDLCRNMCHPNKGVFNTHQLVPLSHDGPVQRTYGIQPLQCFVHNGEGLKLYCDDCRIPICFICERFGEHKGHHVREAHSAFKTIKEQMSDNVATMIGLQSDIDDFLIALSNGRDSIQEDAASLKEQVSNSCQRLHATINQKEQEMHEVIAKDMTNKLYEIQQKMTMCQDKQRRLSGLVQFAQVVLENNNETSIFLTSAASLDDRLTSELSNSPNLQLNEAFLTFDHIIIDVKSAIRGIKKMQARKITEPKVPIITKTEIMNRNGVMLSVNHEPCDVKWYDVAYQPINGQWMTMRWEVMADQSGASLEDHIHLGNLDYDCEYYFCVCMTNVAGTSPWSQTVAVKTKTVATEFILNEETAHPALLLSEDGKTVKRREDYIHKKMKVSEKIQLGMRFVRNVHCILGDVILSDDAHYWEVEASQSGATSYAIGVATADCHRNQQLGTNKSSWCLEIMGITARGYHNDRCTRIKHNLECNSTRRFGILLDYQRRALEFFYKEKLLLSYAVNAKVTDLCPAFDLTNSSAKLSIITGLKIPEFLNTCHVHIQQKMGDSLERELSCAVCMELYTDPLLLPCAHSFCRSCLPDVLKRNSNQKSGHSRLVCPSCRFTVELDKRGIDGLPRNFLLDNIIERYKEEKSTDGRPVKVKGVACDVCADSGGAKASKTCIQCGVSYCDRCLRTYHPSKGVFSKHKLVKATRNPKRKDVYCPEHDDELIKMYCVQCKTPVCYLCDRFGGHKGHQVAELKTSYKLMKETLSSNLAQLVSKMANVNEFIITLERKGESIQTNAAVMHQRISEEFAVIKAMLEQRERVMHTKISEETARKLLLLKQQNMACQDKLHNTAGLIQYTREVLKEEVPAALLLTGASLDDRLNCAIDSCPQLQPNTADDFSHVILNLEYEKQIIQKMDLLTIKAPEKPRMGGHIEVRNTVHLSIKHAPCIVDSYDMGVCKSGGLWDYFKIECGKGDERETDEYRLVKEDLAFDSEYFFKARVRNKAGASGWSNVFPARTGPQAMTFRLDPETAHEDLVIISAGRSVIYQPRPRGFWVMQEEGKASTGRFHGRALSVLADVVLATGVHYWEVTTQVAEKQHGESLSHYTDRDASVYRGDIAIGLAKQNCNRDLCLGSDGSSWALRLPSNGGNWYVAHKNKQHVISAASAIDSCPRSQFPAGLHVGILVDFTHHKLRFYDCNRKILLYSCDQISKEKRLCPALEISDSSYQLNLRTGTGIPDYANSK" image

Everything is fine at the beginning, but at the modelling step it keeps running until the RAM is exhausted, and then this error pops up. I don't know what to do; I tried on two different computers and it keeps happening. Other errors do appear (I don't have screenshots), but they are much the same.

If anyone knows how to solve it I would be really happy.

Scrupy07 commented 2 years ago

This is the error I usually encounter when modelling this protein: [image]

Does anyone know how to fix it?

sokrypton commented 2 years ago

This is the error you'll see if the previous step failed and didn't return any models (so there are none to display).

In this case, that looks like a large protein. On Google Colab, the upper limit is a length of ~1400.
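
As an illustration of the two points above, here is a small Python sketch: a length check to run before starting a job, and a guard that turns an empty result list into a clearer message than a bare "list index out of range". The constant `COLAB_MAX_LENGTH` simply encodes the approximate ~1400 figure quoted above, and `total_length`, `exceeds_limit` and `first_model` are hypothetical helpers, not ColabFold functions.

```python
# Approximate limit quoted in this thread; an assumption, not an exact constant.
COLAB_MAX_LENGTH = 1400


def total_length(query):
    """Total residues in a query; chains are separated by ':' and whitespace is ignored."""
    return sum(len("".join(chain.split())) for chain in query.split(":"))


def exceeds_limit(query, limit=COLAB_MAX_LENGTH):
    """True if the combined length of all chains is above the limit."""
    return total_length(query) > limit


def first_model(models):
    """Return the first model, or fail with an explanation if none were produced."""
    if not models:
        # Indexing an empty list is what raises 'list index out of range'.
        raise RuntimeError(
            "the prediction step returned no models; check its log "
            "(for example, whether it ran out of memory)"
        )
    return models[0]
```

A query just under the limit can still exhaust memory, so a passing check is not a guarantee that the run will finish.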

Scrupy07 commented 2 years ago

Thanks for replying. This protein is 1379 residues long, so it is close to the limit. Perhaps I could divide it into three overlapping fragments, model each one separately, and use the shared regions to get an idea of how the whole protein would look. What do you think?
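
A minimal Python sketch of the splitting idea, assuming equal-sized fragments and a fixed overlap: the helper `overlapping_fragments` and its default overlap of 50 residues are illustrative choices, not something recommended in this thread. It only cuts the sequence; whether the separately predicted fragments can be combined into a meaningful full-length model is not answered here.

```python
def overlapping_fragments(seq, n_fragments=3, overlap=50):
    """Split seq into n_fragments windows that share `overlap` residues.

    Returns a list of (start, end, fragment) with 1-based inclusive positions.
    Hypothetical helper; the overlap size is an arbitrary default.
    """
    seq = "".join(seq.split())  # drop spaces and line breaks
    if n_fragments < 1 or overlap < 0:
        raise ValueError("n_fragments must be >= 1 and overlap >= 0")
    if len(seq) < n_fragments:
        raise ValueError("sequence is shorter than the number of fragments")

    # Window size such that n windows with the given overlap cover the sequence:
    # n * size - (n - 1) * overlap >= len(seq), rounded up.
    size = -(-(len(seq) + (n_fragments - 1) * overlap) // n_fragments)
    if size <= overlap:
        raise ValueError("overlap is too large for this sequence length")

    step = size - overlap
    fragments = []
    for i in range(n_fragments):
        start = i * step
        if start >= len(seq):
            break
        end = min(start + size, len(seq))
        fragments.append((start + 1, end, seq[start:end]))
    return fragments
```

For a 1379-residue sequence with the defaults, this gives three windows of 493 residues each, which is well below the ~1400 limit mentioned above.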