Closed by abiadak3 1 year ago
Alphafill assumes a standard layout of the PDB directory, as if you had fetched it from PDB-REDO (or the PDB, of course). That means the files are located in subdirectories with two-character names. Of course, this is inflexible and a bit silly. But if you move your files into a directory called ./pdb/00 and then use ./pdb as the argument, the prepare-pdb-list command will probably work better.
I see room for improvement in alphafill here.
Ok, thanks again. So it really needs a directory structure like this:
pdb/02/102l/102l_final.cif
pdb/02/102d/102d_final.cif
pdb/00/100d/100d_final.cif
Ah, yes, that's the pdb-redo way of storing data.
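For anyone starting from a flat directory of mmCIF files, here is a minimal Python sketch of how that nested layout could be built. It assumes the two-character subdirectory is the middle two characters of the PDB ID, which matches the three example paths above (102l goes to 02, 100d goes to 00) but is inferred from them rather than confirmed. The function names `target_path` and `arrange` are hypothetical helpers and are not part of alphafill.

```python
import shutil
from pathlib import Path


def target_path(root: Path, cif_file: Path) -> Path:
    """Return where a file such as 102l.cif or 102l_final.cif would live under root."""
    # Take the part of the file name before the first "_" or "." as the PDB ID.
    pdb_id = cif_file.name.split("_")[0].split(".")[0].lower()
    if len(pdb_id) != 4:
        raise ValueError(f"not a four-character PDB ID: {cif_file.name}")
    # Assumption: the subdirectory name is the middle two characters of the ID.
    return root / pdb_id[1:3] / pdb_id / f"{pdb_id}_final.cif"


def arrange(flat_dir: Path, root: Path) -> list[Path]:
    """Copy every *.cif file in flat_dir into the nested layout under root."""
    created = []
    for src in sorted(flat_dir.glob("*.cif")):
        dst = target_path(root, src)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy rather than move, so the originals stay intact
        created.append(dst)
    return created
```

Calling `arrange(Path("flat_cifs"), Path("pdb"))` would then leave a `pdb` directory that can be passed as `--pdb-dir`; `flat_cifs` is only a placeholder name.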
Should the names in the fasta be the base cif name?
The name in the fasta should be the four-letter code, an underscore, and the asym_id of the chain.
Like in:
>1cbs_A
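The naming rule above can be sketched in Python as follows. This is only an illustration: `fasta_header` and `parse_header` are hypothetical helpers, and the regular expression assumes a classic four-character PDB ID (a digit followed by three alphanumeric characters) and an alphanumeric asym_id.

```python
import re

# Assumption: a classic PDB ID is one digit plus three alphanumeric characters,
# and the asym_id consists of letters and digits only.
HEADER_RE = re.compile(r"^>([0-9][0-9A-Za-z]{3})_([A-Za-z0-9]+)$")


def fasta_header(pdb_id: str, asym_id: str) -> str:
    """Build a header line: four-letter code, underscore, asym_id."""
    return f">{pdb_id.lower()}_{asym_id}"


def parse_header(line: str) -> tuple[str, str]:
    """Split a header line into (pdb_id, asym_id), ignoring any trailing description."""
    fields = line.split()
    token = fields[0] if fields else ""
    match = HEADER_RE.match(token)
    if match is None:
        raise ValueError(f"header does not look like >xxxx_CHAIN: {line!r}")
    return match.group(1), match.group(2)
```

Running `parse_header` over every line that starts with `>` is a quick way to spot headers that do not follow the expected pattern before handing the file to alphafill.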
After I was able to compile alphafill, I encountered the same issues. If you could provide users with a concrete alphafill.conf example, it would be really appreciated.
I tried guessing at alphafill.conf with something like pdb-dir=./pdb-redo pdb-fasta=./1gos.fasta, which of course does not work.
Another question: looking at the output of
alphafill -h
is there any option to supply a UniProt fasta? Or is that only available on the web (alphafill.eu)? Also, not to be confused,
The "--output" option is not recognized, it gives an error message saying "unknown option".
Could you update the README.md? Many thanks,
The fasta file should contain all the sequences that are represented in the structure data. I.e. all pdb-redo data. Putting in the Uniprot fasta makes no sense as you do not have associated structure models.
Did you try absolute paths for the data files, e.g.:
pdb-fasta=/DATA/pdb-redo/others/pdbredo_seqdb.txt
pdb-dir=/DATA/pdb-redo/
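As a rough illustration of that suggestion, the Python sketch below writes an alphafill.conf containing the two keys mentioned in this thread, with the paths converted to absolute ones. Only `pdb-dir` and `pdb-fasta` come from the discussion; the helper names `render_config` and `write_config` are hypothetical, and any other settings the config file may accept are not covered here.

```python
from pathlib import Path


def render_config(pdb_dir: str, pdb_fasta: str) -> str:
    """Return config text with both paths made absolute."""
    abs_dir = Path(pdb_dir).resolve()
    abs_fasta = Path(pdb_fasta).resolve()
    # Only these two keys appear in this thread; nothing else is assumed.
    return f"pdb-dir={abs_dir}\npdb-fasta={abs_fasta}\n"


def write_config(conf_path: str, pdb_dir: str, pdb_fasta: str) -> None:
    """Write the rendered text to conf_path, warning about missing inputs."""
    for label, path in (("pdb-dir", pdb_dir), ("pdb-fasta", pdb_fasta)):
        if not Path(path).exists():
            print(f"warning: {label} path does not exist: {path}")
    Path(conf_path).write_text(render_config(pdb_dir, pdb_fasta))
```

For example, `write_config("alphafill.conf", "./pdb-redo", "./pdbredo_seqdb.txt")` would produce a file in the current directory with both entries expanded to full paths.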
Thanks for your feedback. I have downloaded pdb-redo to the local machine, but I don't see the file others/pdbredo_seqdb.txt. Did I perhaps fail to download all the files? My pdb-redo copy totals 573G. How large is the complete pdb-redo DB?
Many thanks,
It is over a TB, but that includes a lot of data you don't need. You only need the mmCIF files called ????_final.cif
You can get the sequence file through a browser but you need to be logged in on pdb-redo.eu. (https://pdb-redo.eu/others/pdbredo_seqdb.txt)
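To pick out just the files named above from a larger download, a filter along these lines could be used. It is a minimal Python sketch: `is_final_model` and `select_final_models` are hypothetical helpers, and accepting a `.gz` variant of the name is an assumption rather than something stated here.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# "????_final.cif" is the pattern given above; the ".gz" variant is an assumption.
PATTERNS = ("????_final.cif", "????_final.cif.gz")


def is_final_model(path: str) -> bool:
    """True if the file name (ignoring directories) matches one of the patterns."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in PATTERNS)


def select_final_models(paths: list[str]) -> list[str]:
    """Keep only the paths that name a final mmCIF model."""
    return [p for p in paths if is_final_model(p)]
```

Feeding a file listing through `select_final_models` gives the subset worth copying, which should be much smaller than the full archive.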
Many thanks, I was able to install all the pdb-redo cif files. Alphafill works nicely on a local machine.
However, I have one question regarding the alphafill DB. My local alphafill DB was downloaded more than a year ago. Has the DB on the current website (https://alphafill.eu/) been updated since then?
For example, the UniProt ID A0A5P2XKZ4 is found on the web, but not in the locally installed DB. If I run
rsync -av rsync://rsync.alphafill.eu/alphafill/ alphafill/
will it be synchronized with the current alphafill website DB?
Thanks,
Yes
I am currently updating the alphafill DB; could you tell me how big it is? Thanks,
172 GB, but that will gradually go up.
Hello, would alphafill work with gzipped cif files or should I decompress them?
Alphafill should work with gzipped files.
I'm having some issues running the function
prepare-pdb-list
for filtering the input PDB list. I'm running it like this:
alphafill prepare-pdb-list --pdb-dir=./pdb --pdb-fasta=pdb_seqres-test.txt
In the "pdb" directory, there are 1000 PDBs in CIF format, and some of them contain ligands. The file "pdb_seqres-test.txt" contains the sequences in FASTA format corresponding to the files in the "pdb" directory.
The program doesn't list the PDBs with ligands; it simply outputs a blank line in less than 1 second. Tracing the program with strace shows that it only accesses the files "af-ligands.cif" and "alphafill.conf" and the "pdb/" directory. It does not read the ".cif" files in "pdb/" or the "pdb_seqres-test.txt" file.
The "--output" option is not recognized, it gives an error message saying "unknown option".
The program does not run if the alphafill.conf file does not exist in the directory. It gives an error message saying "the specified config file was not found". On the other hand, I cannot find in the documentation what the format of that file is, or a list of the options it can contain with their explanations.
Maybe I'm doing something wrong?