Problem with the scop_id values in HOMSTRAD

Hello,

This is completely normal. See issue #15, which I am reposting here. I will also add this information to the README of the Partage folder.

I just updated the SCOP ids, they were wrong! Please take the new ones. You will notice that 46 families have more than one SCOP id. This can happen for three reasons:

(1) The reference PDB contains several domains which have different SCOP ids (example: 5_3 endonuclease: a.60.7.1, c.120.1.2),

(2) The PDBs associated with the family (reference PDB + other PDBs whose codes are indicated in the MAP file) have different SCOP ids, although each of them covers the whole query (example: hexapep: b.81.1.5, b.81.1.1, b.81.1.2),

(3) The PDBs associated with the family (reference PDB + other PDBs whose codes are indicated in the MAP file) have different SCOP ids, and they do not cover the same parts of the query (example: fer4: d.58.1.4, i.4.1.1, d.58.1.5, d.58.1.1, d.58.1.2, d.58.1.3).

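In case (2), the different SCOP ids actually correspond to very similar structures. In cases (1) and (3), they can correspond to different structures. The examples given in parentheses show this clearly: the hexapep ids all share the superfamily b.81.1, whereas the 5_3 endonuclease ids (a.60.7.1, c.120.1.2) and some of the fer4 ids (d.58.1.x vs. i.4.1.1) belong to different SCOP classes.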

To evaluate your results at the family level, you can simply rely on the family names and ignore the SCOP ids: what you want to know is the rank at which the real HOMSTRAD family of the query appears. At higher levels, such as SCOP superfamilies and folds, you can count a HOMSTRAD family as a valid hit when it shares at least one SCOP superfamily/fold with the real HOMSTRAD family of the query.
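As an illustration of the "shares at least one superfamily/fold" rule, here is a minimal Python sketch. It assumes that a SCOP id has the usual dotted form class.fold.superfamily.family (e.g. b.81.1.5), so that the fold is the first two fields and the superfamily the first three. The function names (`scop_prefix`, `shares_level`, `rank_of_first_valid_hit`) are hypothetical and not part of any project code.

```python
# Number of dotted fields that identify each SCOP level (assumed sccs layout:
# class.fold.superfamily.family, e.g. "b.81.1.5").
LEVEL_DEPTH = {"class": 1, "fold": 2, "superfamily": 3, "family": 4}


def scop_prefix(scop_id, level):
    """Truncate a SCOP id to the given level, e.g. 'b.81.1.5' -> 'b.81.1'."""
    depth = LEVEL_DEPTH[level]
    return ".".join(scop_id.strip().split(".")[:depth])


def shares_level(true_ids, hit_ids, level):
    """True if the two families share at least one SCOP group at `level`."""
    true_groups = {scop_prefix(s, level) for s in true_ids}
    hit_groups = {scop_prefix(s, level) for s in hit_ids}
    return not true_groups.isdisjoint(hit_groups)


def rank_of_first_valid_hit(true_ids, ranked_hits, level):
    """Return the 1-based rank of the first valid hit, or None if there is none.

    `ranked_hits` is a list of (family_name, scop_ids) pairs, best hit first.
    """
    for rank, (_name, hit_ids) in enumerate(ranked_hits, start=1):
        if shares_level(true_ids, hit_ids, level):
            return rank
    return None


if __name__ == "__main__":
    hexapep = ["b.81.1.5", "b.81.1.1", "b.81.1.2"]
    fer4 = ["d.58.1.4", "i.4.1.1", "d.58.1.5"]
    ranked = [("fer4", fer4), ("hexapep", hexapep)]
    print(rank_of_first_valid_hit(hexapep, ranked, "superfamily"))
```

At the family level the comparison is done on HOMSTRAD family names instead, so this helper is only meant for the superfamily and fold levels.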

For those of you who are interested in knowing exactly which of the reference+other PDBs have which SCOP_id, here's a file containing such information: http://scop.mrc-lmb.cam.ac.uk/scop/parse/dir.cla.scop.txt_1.75.
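For those who want to look up the SCOP ids of individual PDB entries, the following Python sketch shows one possible way to read that file. It assumes the layout I believe the SCOP classification files use: comment lines starting with `#`, then tab-separated columns with the PDB code in the second column and the SCOP id (sccs) in the fourth. Please check this against the header of the file you download. The function name `parse_scop_cla` and the sample lines are made up for illustration.

```python
from collections import defaultdict


def parse_scop_cla(lines):
    """Map each PDB code (lower case) to the set of SCOP ids of its domains.

    Assumed layout (to be checked against the real file): '#' comment lines,
    then tab-separated fields: domain id, PDB code, region, SCOP id, ...
    """
    pdb_to_scop = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # ignore lines that do not match the assumed layout
        pdb_code, scop_id = fields[1].lower(), fields[3]
        pdb_to_scop[pdb_code].add(scop_id)
    return pdb_to_scop


if __name__ == "__main__":
    # Made-up sample lines in the assumed format, not real SCOP entries.
    sample = [
        "# hypothetical header",
        "d1abca1\t1abc\tA:1-100\ta.60.7.1\t11111\tcl=1,cf=2,sf=3,fa=4",
        "d1abca2\t1abc\tA:101-250\tc.120.1.2\t22222\tcl=5,cf=6,sf=7,fa=8",
    ]
    print(sorted(parse_scop_cla(sample)["1abc"]))
```

With the real file, you could pass an open file handle instead of the sample list, e.g. `with open(path) as fh: table = parse_scop_cla(fh)`, where `path` is wherever you saved the download.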


Problem with the scop_id values in HOMSTRAD #24