dellacortelab / prospr

ProSPr: Protein Structure Prediction
MIT License

Correct Path #14

Closed aai97 closed 4 years ago

aai97 commented 4 years ago

I am having trouble getting prospr to run. I have installed Docker and prospr on my Ubuntu machine. I have my own amino acid sequence in a fasta file at home/prospr-data/data/TvlDH/TvlDH.fasta, as well as an a3m file from a previous hhblits run. Within the home/prospr-data/data directory I also have a folder called hhblits and one called psiblast, both still empty.

Now if I run the command you propose:

    docker run -t -v /path/to/local/data:/data prospr/prospr -t --a3mfile /data/my_file.a3m

specifically:

    docker run -t -v home/prospr-data/data:/data prospr/prospr -t --a3mfile /data/TvlDH/TvlDH.a3m

I get an error:

prospr.py: error: argument command: invalid choice: '/data/my_file.a3m' (choose from 'run', 'build')
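
The wording of this message is the one Python's argparse produces when a parser with subcommands receives a first positional argument that is not one of the subcommand names. The following is a minimal sketch that reproduces it, assuming prospr.py defines its `run` and `build` subcommands through argparse; the parser below is a hypothetical stand-in, not the actual prospr.py code, and only the two subcommand names are taken from the error message.

```python
import argparse
import contextlib
import io


def build_parser():
    # Hypothetical stand-in for the parser in prospr.py; only the two
    # subcommand names come from the error message in this thread.
    parser = argparse.ArgumentParser(prog="prospr.py")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("run")
    subparsers.add_parser("build")
    return parser


def try_parse(argv):
    """Parse argv and return (exit_code, stderr_text)."""
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse reports usage errors by printing to stderr and exiting.
        return exc.code, stderr.getvalue()
    return 0, stderr.getvalue()


# The options are unknown to the top-level parser, so the path becomes
# the first positional argument and is checked against the subcommands.
code, message = try_parse(["-t", "--a3mfile", "/data/my_file.a3m"])
print(code)
print(message)
```

In this sketch, `try_parse(["build"])` parses without error, which suggests that the first argument after the image name has to be `run` or `build`. Which options those subcommands accept in prospr itself is not shown in this thread.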

I get the same error no matter what combinations of paths I try to use, and no matter if I use the .fasta file or the .a3m file that I had previously generated using hhblits.

So if I instead run

    docker run -t -home/prospr-data/data prospr/prospr build TvlDH

or

    docker run -t -v home/prospr-data/data:/data prospr/prospr build TvlDH

or really anything of the form

    docker run -t -v <anything here> prospr/prospr build <anything there>

I get:

Would you like to download uniclust for hhblits? (25GB) [y/n]:

However, if I type 'y', nothing happens (the terminal starts a process, but it uses no CPU). The same happens if I type 'no'.

I'm at a loss here, especially because when I check prospr.py, it seems that the program fails at a simple print on line 85; when I type 'no' it fails to print anything.

Does anyone have any idea what could be going on here?

aai97 commented 4 years ago

My best guess is that at least the second problem has something to do with pconf.basedir, since it shows up in both prospr.py and hhblits.py. I have tried to modify the path in pconf.py, but that has not worked yet either.

aai97 commented 4 years ago

Never mind for now. I just realized that one doesn't seem to have to use Docker. I am now trying to run everything using the provided Python scripts directly and will give an update tomorrow on whether it worked, and if so, how. Right now I am downloading the hhblits database, so that is already some progress.

aai97 commented 4 years ago
  1. Download the prospr-master archive and unpack it.
  2. Create a directory for your data.

    This directory will hold the hhblits/ and psiblast/ subdirectories for the respective databases, and you will probably want to (maybe have to, I haven't checked yet) keep your amino acid sequences here as well. For me, this folder is called <path-to>/prospr-data, and I save the amino acid sequence asdf.fasta for a protein asdf in <path-to>/prospr-data/asdf/asdf.fasta. You don't have to create the hhblits and psiblast directories yourself unless you want to use already-installed databases.

  3. Modify the following files:
    • In <path-to>/prospr-master/prospr/pconf.py, change the variable basedir to the base directory where all your prospr data will be created and kept, in my case <path-to>/prospr-data.
    • If you want to use a different hhblits database, you can modify the hhblits.py file in the same directory accordingly. The same probably goes for psiblast and psiblast.py, but I haven't checked.
  4. Get your amino acid sequence into .fasta format and place it in your data directory.
  5. Open a terminal window and run the following commands:

    cd <path-to>/prospr-master
    python3 prospr.py build <path-to>/prospr-data/asdf/asdf.fasta

    If you haven't placed a database in the <path-to>/prospr-data/hhblits directory you will get the following message:

    Would you like to download uniclust for hhblits? (25GB) [y/n]:

    Type y and hit enter. The download of the 25GB database should start now.
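
Steps 2, 4 and 5 above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `write_fasta` and `build_command`, the 60-character line width, and the example sequence are hypothetical and not part of prospr. The only things taken from the steps above are the `<name>/<name>.fasta` layout and the `python3 prospr.py build` command.

```python
import tempfile
from pathlib import Path


def write_fasta(base_dir, name, sequence, width=60):
    """Write <base_dir>/<name>/<name>.fasta and return its path."""
    target = Path(base_dir) / name / f"{name}.fasta"
    target.parent.mkdir(parents=True, exist_ok=True)
    # FASTA: one header line starting with '>', followed by the sequence,
    # split here into fixed-width lines for readability.
    cleaned = "".join(sequence.split()).upper()
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    target.write_text(f">{name}\n" + "\n".join(lines) + "\n")
    return target


def build_command(fasta_path):
    """Command from step 5; meant to be run from inside prospr-master."""
    return ["python3", "prospr.py", "build", str(fasta_path)]


# Demo in a temporary directory standing in for <path-to>/prospr-data.
with tempfile.TemporaryDirectory() as data_dir:
    # Placeholder sequence, not a real protein of interest.
    fasta = write_fasta(data_dir, "asdf", "mktayiakqr qisfvkshfs")
    print(fasta.read_text())
    print(" ".join(build_command(fasta)))
```

The command is only assembled and printed here, not executed; running it still requires basedir in pconf.py to point at the data directory, as described in step 3.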