julianstanley / ProteinFeatures

Feature extraction from .pdb files
DISOPRED/SPPIDER integration #3

Open julianstanley opened 5 years ago

julianstanley commented 5 years ago

Should be straightforward. Need to think about file format.

This should be split into 3 files.

One file should have the protein-binding call and score from SPPIDER (for protBindSPPIDER_res).

Every file needs: 1) PDB file name (a1) 2) AA (a2) 3) AA position in the structure (a3)

Data columns: 4) SPPIDER call (a4) 5) DISOPRED disordered call (a5) 6) DISOPRED disordered score (a6) 7) DISOPRED protein binding call (a7) 8) DISOPRED protein binding score (a8)

julianstanley commented 5 years ago

Create input files from the original concatenated file:

cut -d, -f1,2,3,7,8 disopred_all.csv | tail -n +2 > disopred_binding.csv

Repeat with the appropriate column numbers for the other files.

cut keeps only the listed columns (here 1, 2, 3, 7, and 8) and drops the rest. tail -n +2 removes the header line. The input file should not have a header. (Consider adding a check for that in the future so files with headers work as well.)
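A rough sketch of what that header check might look like, combined with the split into three files. Several things here are assumptions rather than anything decided in this issue: the helper names strip_header and split_features, the output names sppider.csv and disopred_disorder.csv, and the column groupings for those two files (inferred from the column list above; only the 1,2,3,7,8 grouping for disopred_binding.csv is taken from the command). The header test itself is also a guess: it treats the first line as a header when column 3 (AA position) does not look like a residue number.

```shell
#!/usr/bin/env bash
# Sketch only: the header test and the column-to-file mapping are assumptions.

# Print a CSV without its header line, if it has one.
# The first line counts as a header when column 3 (AA position) is not a
# residue number such as 17, -3, or 52A (number plus optional insertion code).
strip_header() {
    awk -F, 'NR == 1 && $3 !~ /^-?[0-9]+[A-Za-z]?$/ { next } { print }' "$1"
}

# Split the concatenated file into three header-free input files.
# $1 = concatenated CSV, $2 = output directory (must already exist).
split_features() {
    all_csv=$1
    out_dir=$2
    # a1-a3 plus SPPIDER call (a4)
    strip_header "$all_csv" | cut -d, -f1,2,3,4   > "$out_dir/sppider.csv"
    # a1-a3 plus DISOPRED disordered call and score (a5, a6)
    strip_header "$all_csv" | cut -d, -f1,2,3,5,6 > "$out_dir/disopred_disorder.csv"
    # a1-a3 plus DISOPRED protein binding call and score (a7, a8)
    strip_header "$all_csv" | cut -d, -f1,2,3,7,8 > "$out_dir/disopred_binding.csv"
}
```

Usage would be something like `split_features disopred_all.csv .` after sourcing the file. Because the header is only dropped when it is detected, the same call should work on files with or without one, provided the position column really does hold residue numbers.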