Urinx / alphafold_pytorch

An implementation of DeepMind's AlphaFold based on PyTorch for research
Apache License 2.0
392 stars 92 forks

feature.sh does not generate .pkl or .tfrec file from T1019s2.seq #10

Open sean-lin-tw opened 4 years ago

sean-lin-tw commented 4 years ago

Hi, I ran feature.sh with T1019s2.seq as input. It generated the files as follows: [image] But there was no T1019s2.pkl or T1019s2.tfrec. Can you tell me how to generate those files correctly? Thank you!

NicolasBelloy commented 4 years ago

Hi, I have the same issue with my own sequence. After plmDCA and the final python feature.py -s $TARGET_SEQ -f step, neither a .pkl nor a .tfrec file is generated.

nicolasfredesfranco commented 4 years ago

Hi, I have the same problem and the same output files. I don't know whether I need to merge these files and convert them to .npy (and if so, I don't know what would happen to the text in the files). I suspect the mistake is on our side, but the code seems to be written to produce only these files and nothing more. So is another step necessary? Do we need to copy the text from each file into one larger file, or something similar? Is the code incomplete? I can't tell whether the author wrote it this way because the next step seems obvious to them (I don't see it), because the step is explained in the original AlphaFold code (I don't see that either), because they don't know what to do after these calculations to produce the final features, or because we are missing something. In any case, I would be very grateful if someone could help us. Perhaps, with the author's help, we can help each other.

nicolasfredesfranco commented 4 years ago

I got it! I added a line at the beginning of feature.sh:

    OUTPUT="${TARGET_DIR}/${TARGET}_out"

and then replaced lines 20 and 30 with:

    python feature.py -s $TARGET_SEQ -f -o $OUTPUT

With this you get the final NumPy (.npy) file that you can use as the network input. If you want it as tfrec or pkl, just convert this file.
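For the "just convert this file" step, here is a minimal sketch of one way to re-save a .npy file as a pickle. It assumes the .npy was written with `np.save` and may hold either a plain array or a pickled Python object such as a dict; the function name `npy_to_pkl` is hypothetical and is not part of this repository. What the file actually contains is not stated in this thread, so check it before relying on the result.

```python
import pickle
from pathlib import Path

import numpy as np


def npy_to_pkl(npy_path, pkl_path=None):
    """Load a .npy file and re-save its content with pickle.

    Hypothetical helper for illustration; not part of alphafold_pytorch.
    """
    npy_path = Path(npy_path)
    pkl_path = Path(pkl_path) if pkl_path else npy_path.with_suffix(".pkl")

    # allow_pickle=True is required when the file stores a Python object
    # (for example a dict of features) instead of a plain numeric array.
    # Only use it on files you trust, since unpickling can run code.
    data = np.load(npy_path, allow_pickle=True)

    # np.save wraps a non-array object in a 0-d object array; unwrap it
    # so the pickle holds the original object.
    if data.dtype == object and data.shape == ():
        data = data.item()

    with open(pkl_path, "wb") as fh:
        pickle.dump(data, fh)
    return pkl_path
```

Calling `npy_to_pkl("T1019s2_out.npy")` (a hypothetical file name) would write `T1019s2_out.pkl` next to the input and leave the .npy untouched.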

nicolasfredesfranco commented 3 years ago

After studying the code in depth, I believe line 164 of feature.py has to be changed from "exit" to "continue", like this:

    else:
        aln, aln_id = read_aln(fas_file)
        aln = aln[:, aln[0] != '-']
        write_aln(aln, aln_id, aln_file)
        continue

The "exit" causes problems, and because of it, in some cases the code doesn't save the .npy file.
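To illustrate why swapping "exit" for "continue" can matter, here is a toy loop. It is not the real control flow of feature.py, and every name in it is hypothetical. It only shows the general difference: `sys.exit` ends the whole program, so nothing after it is saved, while `continue` skips to the next loop iteration.

```python
import sys


def process_all(targets, alignment_only, stop_with_exit):
    """Toy stand-in for a processing loop; not the real feature.py logic."""
    saved = []
    for target in targets:
        if target in alignment_only:
            # Stand-in for the branch that only rewrites the alignment file.
            if stop_with_exit:
                sys.exit(0)  # terminates the program; later targets never run
            continue  # skips just this iteration; the loop keeps going
        saved.append(target + ".npy")  # stand-in for saving the features
    return saved


# With continue, the second target is still processed and saved.
print(process_all(["a", "b"], {"a"}, stop_with_exit=False))
```

With `stop_with_exit=True` the same call raises `SystemExit` on the first target, and the save for `"b"` is never reached.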

ibr1996 commented 3 years ago

I have the same issue, and I can't figure it out yet. The things that nicolasfredesfranco said didn't work :(

tokiinan5 commented 3 years ago

@nicolasfredesfranco How can I convert a .npy file to tfrec? I want to use the generated feature file with the AlphaFold implementation, so I need to convert it to tfrec.

ibr1996 commented 3 years ago

You don't need the tfrec file; you can use the .npy file. Just rename its extension to .pkl so that you don't have to change anything in the code, and that's it.
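A minimal sketch of the rename step, done as a copy so the original .npy is kept. The helper name `copy_as_pkl` and the example file name are hypothetical. The bytes are not changed, only the file name, so this works only if the loading code reads the file the same way it would read the .npy; how the loader reads it is not shown in this thread.

```python
import shutil
from pathlib import Path


def copy_as_pkl(npy_path):
    """Copy a .npy file to the same name with a .pkl extension.

    Hypothetical helper; the file content is left byte-for-byte identical.
    """
    src = Path(npy_path)
    dst = src.with_suffix(".pkl")
    shutil.copyfile(src, dst)
    return dst


# Example call (hypothetical file name):
# copy_as_pkl("T1019s2_out.npy")
```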

tokiinan5 commented 3 years ago

So I can use the generated feature file (.npy) directly as input to the original AlphaFold CASP13 code just by changing the extension to .pkl?

ibr1996 commented 3 years ago

Yeah, at least it worked well for me. Then, with the .rr file that you get from the alphafold.sh script, you can get the structure using 3DFuzz, a web server for structure prediction.

tokiinan5 commented 3 years ago

Thank you for your reply. What I actually wanted was to take the features generated by this project and use them as input to https://github.com/deepmind/deepmind-research/tree/master/alphafold_casp13. That's why I needed the .tfrec file, I guess.

ibr1996 commented 3 years ago

The problem with that project is that you can use it only for the structures used in CASP13, so I don't know how to work with it (I didn't try it because of that limitation).

Geraldene commented 3 years ago

@nicolasfredesfranco when running the alphafold.py script, did you have to change the number of input channels to match your input data? I have generated the .npy file, but when running alphafold.py I get the following error:

RuntimeError: Given groups=1, weight of size [240, 1878, 3, 3], expected input[1, 1880, 64, 64] to have 1878 channels, but got 1880 channels instead

I am not too familiar with PyTorch, so I am not sure how to resolve it. Do you have any suggestions?
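For anyone reading that error message, here is a small sketch of how the two shapes relate. It assumes the usual PyTorch conventions: a Conv2d weight is laid out as (out_channels, in_channels, kernel_h, kernel_w) when groups=1, and the input as (batch, channels, height, width). The helper name `channel_mismatch` is hypothetical. It only quantifies the mismatch; which feature contributes the extra channels is not established in this thread.

```python
def channel_mismatch(weight_shape, input_shape):
    """Return how many more channels the input has than the layer expects.

    Assumes groups=1 and NCHW layout, as in the error message above.
    """
    expected = weight_shape[1]  # in_channels the first conv layer was built for
    actual = input_shape[1]     # channels present in the feature tensor
    return actual - expected


# Shapes copied from the RuntimeError above.
print(channel_mismatch((240, 1878, 3, 3), (1, 1880, 64, 64)))  # → 2
```

So the generated feature tensor carries two more channels than the network's first convolution was built for.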

elephantpanda commented 1 year ago

@nicolasfredesfranco Hi, were you able to make pkl files for different proteins? I am currently implementing this model in Unity but I need some additional example pkl files. Do you have any pkl files for some small proteins that you would be prepared to share? Thanks!