Sabryr / ProteinFolding

MIT License

Comparison of AlphaFold and RoseTTAFold on the Fox compute cluster. The same SLURM configuration was used for all analyses:

    #SBATCH --time=10:10:00
    #SBATCH --nodes=1
    #SBATCH --ntasks=96
    #SBATCH --mem=1000G
    #SBATCH --partition=accel --gpus=4

Each analysis ran on a single whole node with the resources requested above (96 tasks, 1000 GB of memory and 4 GPUs).

Commands used

RoseTTAFold

    module purge
    module load RoseTTAFold/1.0.0.2
    run_pyrosetta_fox.sh query.fasta .
    predictcomplex.py -i t000.msa0.a3m -o complex -Ls 218 310
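Putting the SLURM directives and the RoseTTAFold commands above together, a job script might look like the following sketch. The file name `rosettafold_job.sh` is hypothetical, reading the command line as two separate commands is an assumption, and site-specific options that the source does not list (for example an account) are left out.

```shell
# Write the job script to a file (rosettafold_job.sh is a hypothetical name).
cat > rosettafold_job.sh <<'EOF'
#!/bin/bash
#SBATCH --time=10:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=96
#SBATCH --mem=1000G
#SBATCH --partition=accel --gpus=4

# Stop at the first failing command.
set -o errexit

module purge
module load RoseTTAFold/1.0.0.2

# Commands as listed in this README; splitting them into two lines is an assumption.
run_pyrosetta_fox.sh query.fasta .
predictcomplex.py -i t000.msa0.a3m -o complex -Ls 218 310
EOF

# Check the script syntax locally; this does not run any of the commands.
bash -n rosettafold_job.sh

# On the cluster, the script would be submitted with:
#   sbatch rosettafold_job.sh
```

Writing the script through a quoted heredoc keeps the `#SBATCH` lines and commands verbatim, so nothing is expanded before the job actually runs.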

AlphaFold

    module load Alphafold/2.0.0.1

    module load AlphaFold/2.0.0-fosscuda-2020b

    ALPHAFOLD_DATA=/cluster/shared/databases/AlphaFold/full
    run.sh -d $ALPHAFOLD_DATA -o out -m model_1,model_2,model_3,model_4,model_5 -f query.fasta -t 2020-05-14

Time comparison

| Protein | # of AA | AlphaFold (hh:mm:ss) | RoseTTAFold (hh:mm:ss) |
|---|---|---|---|
| UniProt-O76024 | 905 | 01:44:44 | 02:24:53 |
| UniProt-P16736 | 972 | 02:15:13 | 04:20:47 |
| UniProt-P35575 | 363 | 00:51:58 | 01:10:13 |
| UniProt-Q9HVT1 | 216 | 00:39:17 | 00:45:57 |
| UniProt-Q9HVT1_No | 196 | 00:38:44 | 00:46:53 |
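To put the timings in proportion, the ratio of RoseTTAFold to AlphaFold run time can be computed from the table. The sketch below is one way to do that with POSIX shell and awk; the helper names `to_seconds` and `ratio` are hypothetical and not part of the repository.

```shell
# Convert an h:mm:ss string to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the ratio of two durations (first divided by second) with two decimals.
ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# RoseTTAFold time divided by AlphaFold time, values taken from the table.
ratio 02:24:53 01:44:44   # UniProt-O76024, prints 1.38
ratio 04:20:47 02:15:13   # UniProt-P16736
ratio 01:10:13 00:51:58   # UniProt-P35575
ratio 00:45:57 00:39:17   # UniProt-Q9HVT1
ratio 00:46:53 00:38:44   # UniProt-Q9HVT1_No
```

For the five proteins in the table the ratio lies between about 1.2 and 1.9, with AlphaFold finishing first in every case.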

Sequences

    >sp|O76024|WFS1_HUMAN Wolframin
    MDSNTAPLGPSCPQPPPAPQPQARSRLNATASLEQERSERPRAPGPQAGPGPGVRDAAAP
    AEPQAQHTRSRERADGTGPTKGDMEIPFEEVLERAKAGDPKAQTEVGKHYLQLAGDTDEE
    LNSCTAVDWLVLAAKQGRREAVKLLRRCLADRRGITSENEREVRQLSSETDLERAVRKAA
    LVMYWKLNPKKKKQVAVAELLENVGQVNEHDGGAQPGPVPKSLQKQRRMLERLVSSESKN
    YIALDDFVEITKKYAKGVIPSSLFLQDDEDDDELAGKSPEDLPLRLKVVKYPLHAIMEIK
    EYLIDMASRAGMHWLSTIIPTHHINALIFFFIVSNLTIDFFAFFIPLVIFYLSFISMVIC
    TLKVFQDSKAWENFRTLTDLLLRFEPNLDVEQAEVNFGWNHLEPYAHFLLSVFFVIFSFP
    IASKDCIPCSELAVITGFFTVTSYLSLSTHAEPYTRRALATEVTAGLLSLLPSMPLNWPY
    LKVLGQTFITVPVGHLVVLNVSVPCLLYVYLLYLFFRMAQLRNFKGTYCYLVPYLVCFMW
    CELSVVILLESTGLGLLRASIGYFLFLFALPILVAGLALVGVLQFARWFTSLELTKIAVT
    VAVCSVPLLLRWWTKASFSVVGMVKSLTRSSMVKLILVWLTAIVLFCWFYVYRSEGMKVY
    NSTLTWQQYGALCGPRAWKETNMARTQILCSHLEGHRVTWTGRFKYVRVTDIDNSAESAI
    NMLPFFIGDWMRCLYGEAYPACSPGNTSTAEEELCRLKLLAKHPCHIKKFDRYKFEITVG
    MPFSSGADGSRSREEDDVTKDIVLRASSEFKSVLLSLRQGSLIEFSTILEGRLGSKWPVF
    ELKAISCLNCMAQLSPTRRHVKIEHDWRSTVHGAVKFAFDFFFFPFLSAA

    >sp|P16736|HELI_HCMVA DNA replication helicase
    MSMTASSSTPRPTPKYDDALILNLSSAAKIERIVDKVKSLSRERFAPEDFSFQWFRSISR
    VERTTDNNPSAATTAAATTTVHSSASSSAAAAASSEAGGTRVPCVDRWPFFPFRALLVTG
    TAGAGKTSSIQVLAANLDCVITGTTVIAAQNLSAILNRTRSAQVKTIYRVFGFVSKHVPL
    ADSAVSHETLERYRVCEPHEETTIQRLQINDLLAYWPVIADIVDKCLNMWERKAASASAA
    AAAAACEDLSELCESNIIVIDECGLMLRYMLQVVVFFYYFYNALGDTRLYRERRVPCIIC
    VGSPTQTEALESRYDHYTQNKSVRKGVDVLSALIQNEVLINYCDIADNWVMFIHNKRCTD
    LDFGDLLKYMEFGIPLKEEHVAYVDRFVRPPSSIRNPSYAAEMTRLFLSHVEVQAYFKRL
    HEQIRLSERHRLFDLPVYCVVNNRAYQELCELADPLGDSPQPVELWFRQNLARIINYSQF
    VDHNLSSEITKEALRPAADVVATNNSSVQAHGGGGSVIGSTGGNDETAFFQDDDTTTAPD
    SRETLLTLRITYIKGSSVGVNSKVRACVIGYQGTVERFVDILQKDTFIERTPCEQAAYAY
    SLVSGLLFSAMYYFYVSPYTTEEMLRELARVELPDVSSLCAAAAATAAAPAWSGGENPIN
    NHVDADSSQGGQSVPVSQRMEHGQEETHDIPCLSNHHDDSDAITDAELMDHTSLYADPFF
    LKYVKPPSLALLSFEETVHMYTTFRDIFLKRYQLMQRLTGGRFATLPLVTYNRRNVVFKA
    NCQISSQTGSFVGMLSHVSPAQTYTLEGYTSDNVLSLPSDRHRIHPEVVQRGLSRLVLRD
    ALGFLFVLDVNVSRFVESAQGKSLHVCTTVDYGLTSRTAMTIAKSQGLSLEKVAVDFGDH
    PKNLKMSHIYVAMSRVTDPEHLMMNVNPLRLPYEKNTAITPYICRALKDKRTTLIF

    >sp|P35575|G6PC1_HUMAN Glucose-6-phosphatase catalytic subunit 1
    MEEGMNVLHDFGIQSTHYLQVNYQDSQDWFILVSVIADLRNAFYVLFPIWFHLQEAVGIK
    LLWVAVIGDWLNLVFKWILFGQRPYWWVLDTDYYSNTSVPLIKQFPVTCETGPGSPSGHA
    MGTAGVYYVMVTSTLSIFQGKIKPTYRFRCLNVILWLGFWAVQLNVCLSRIYLAAHFPHQ
    VVAGVLSGIAVAETFSHIHSIYNASLKKYFLITFFLFSFAIGFYLLLKGLGVDLLWTLEK
    AQRWCEQPEWVHIDTTPFASLLKNLGTLFGLGLALNSSMYRESCKGKLSKWLPFRLSSIV
    ASLVLLHVFDSLKPPSQVELVFYVLSFCKSAVVPLASVSVIPYCLAQVLGQPHKKSL

    >sp|Q9HVT1|Y4490_PSEAE
    MRRLTAFGLALLLLASGVARGEPAVTLDPQQSQVFRAWFVRIAQEQLRQGPSPRWHQQDC
    AGLVRFAANEALKVHDGKWLRANGLSNRYLPPELALSPEQRRLAQNWQQGGGQVGPYVNA
    IKLVQFNSRLVGRDLNQARPGDLMFYDQGDDQHLMIWMGRSIAYHTGSSTPTDNGMRSVS
    LQQLMTWKDTRWIPDESNPNFIGIYRLAFLSQ

    >Q9HVT1_NoSignalPeptide
    GEPAVTLDPQQSQVFRAWFVRIAQEQLRQGPSPRWHQQDC
    AGLVRFAANEALKVHDGKWLRANGLSNRYLPPELALSPEQRRLAQNWQQGGGQVGPYVNA
    IKLVQFNSRLVGRDLNQARPGDLMFYDQGDDQHLMIWMGRSIAYHTGSSTPTDNGMRSVS
    LQQLMTWKDTRWIPDESNPNFIGIYRLAFLSQ
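Sequence lengths such as those in the "# of AA" column can be checked directly from a FASTA file. The sketch below assumes a file in which every record starts with a `>` header line; the function name `count_residues` and the file `toy.fasta` with its two short records are hypothetical and serve only as a demonstration. It strips whitespace before counting, because a raw character count of a wrapped FASTA file also includes one character per line break.

```shell
# Print "<id> <residue count>" for every record in a FASTA file.
count_residues() {
    awk '
        # Header line: emit the previous record, then start a new one.
        /^>/ { if (id != "") print id, n; id = substr($1, 2); n = 0; next }
        # Sequence line: drop blanks, tabs and carriage returns, then count.
        { gsub(/[ \t\r]/, ""); n += length($0) }
        END { if (id != "") print id, n }
    ' "$1"
}

# Hypothetical two-record file used only to demonstrate the function.
printf '>toy1\nMKV\nLA\n>toy2 example\nGEPAV\n' > toy.fasta
count_residues toy.fasta
```

The record identifier is taken as the first whitespace-delimited word of the header without the leading `>`, so a header such as `>sp|O76024|WFS1_HUMAN Wolframin` would be reported as `sp|O76024|WFS1_HUMAN`.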