glygener / glygen-issues

Repository for public GlyGen tickets
GNU General Public License v3.0

Evaluate validation logic in glycotree #540

Open ReneRanzinger opened 1 year ago

ReneRanzinger commented 1 year ago

As part of the request "Better/more obvious integration of "validated" structure annotations from sandbox/glycotree" from the F2F, @edwardsnj will look into the validation logic in glycotree. Afterwards we will talk about this in a Wednesday meeting and decide how we can use this in GlyGen.

Possible uses are:

edwardsnj commented 1 year ago

I have looked into this a bit. The concept of validation and glycotree alignment to canonical nodes is quite a mess, and the on-demand alignment is potentially quite expensive. Enzymes and rules are not used in any way in the alignment to the tree; they are decorated onto the result after the fact. The rules can only be applied in code (for JSON output), not directly by the database, and they either declare a residue abiotic (but residues themselves are either validated or not) or exclude enzymes. It is not possible to directly find all structures with "good" enzymes for all residues without "running" the rule infrastructure individually for each structure's alignment. Lastly, the logic of multiple rules (multiple residues required, at least one of these residues required) is unclear: is it AND or OR? It is equally unclear how rules apply to structures with unassigned residues.
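To make the AND / OR question concrete, here is a minimal sketch of how the two readings diverge for the same structure. It is an illustration only, not glycotree's actual rule engine: the `Rule` class, its `mode` field, the residue IDs (`N1`, `N2`, `N3`) and the enzyme name `ENZ1` are all hypothetical. A result of `None` marks the case where unassigned residues leave the rule undecidable.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: excludes `enzyme` when its residue condition holds."""
    enzyme: str
    required_residues: FrozenSet[str]  # canonical residue IDs named by the rule
    mode: str                          # "all" (AND reading) or "any" (OR reading)


def rule_fires(rule: Rule, assigned: Set[str], n_unassigned: int = 0) -> Optional[bool]:
    """Return True or False when decidable, None when unassigned residues leave it open."""
    missing = rule.required_residues - assigned
    if rule.mode == "all":
        if not missing:
            return True
        # Each unassigned residue could turn out to be one of the missing ones.
        return None if len(missing) <= n_unassigned else False
    # "any": at least one required residue must be present.
    if missing != rule.required_residues:
        return True
    return None if n_unassigned > 0 else False


# Same structure, same residues named by the rule, two readings.
assigned = {"N1", "N3"}
rule_and = Rule("ENZ1", frozenset({"N1", "N2"}), "all")
rule_or = Rule("ENZ1", frozenset({"N1", "N2"}), "any")

and_result = rule_fires(rule_and, assigned)
or_result = rule_fires(rule_or, assigned)
open_result = rule_fires(rule_and, assigned, n_unassigned=1)
```

Under the AND reading the rule does not fire for this structure, under the OR reading it does, and with one unassigned residue the AND reading cannot be decided at all. Whatever semantics are chosen would need to state all three cases explicitly.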

I have checked that the provided API is in place and returns a valid JSON document. The web service for already aligned structures is in place too. It's just not clear to me that the semantics of what is computed and available for interpretation are particularly useful in their current form.

New rule types are needed, along with a better conceptualization of whether rules are for enzymes, for species, or for chemical constraints; better handling of whether an enzyme can be used; and better logic. There are all types of mess to resolve.

ReneRanzinger commented 1 year ago

@edwardsnj I am not sure we want to use the on-the-fly API. We should calculate caveats etc. when we do the glycan update.
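As a rough sketch of what calculating this at update time could look like: run the per-structure evaluation once in a batch, store a small summary per structure, and answer questions such as "which structures have usable enzymes for every residue" from that table instead of calling the API per structure. The field names (`validated`, `enzymes`), the summary keys and the accessions below are assumptions for illustration, not the glycotree JSON schema.

```python
from typing import Dict, List


def summarize(residues: List[dict]) -> Dict[str, bool]:
    """Summarize one aligned structure (hypothetical residue records)."""
    return {
        "all_validated": all(r["validated"] for r in residues),
        # `enzymes` is assumed to hold the enzymes left after rule exclusion.
        "all_have_enzyme": all(len(r["enzymes"]) > 0 for r in residues),
    }


def precompute(alignments: Dict[str, List[dict]]) -> Dict[str, Dict[str, bool]]:
    """Run once during the glycan update; the result can be stored and queried."""
    return {accession: summarize(residues) for accession, residues in alignments.items()}


def structures_with_good_enzymes(table: Dict[str, Dict[str, bool]]) -> List[str]:
    """Simple filter over the stored table, no per-structure rule evaluation."""
    return sorted(
        accession
        for accession, summary in table.items()
        if summary["all_validated"] and summary["all_have_enzyme"]
    )


# Placeholder accessions and residue records, not real data.
alignments = {
    "GLYCAN_A": [
        {"validated": True, "enzymes": ["ENZ1"]},
        {"validated": True, "enzymes": ["ENZ2", "ENZ3"]},
    ],
    "GLYCAN_B": [
        {"validated": True, "enzymes": ["ENZ1"]},
        {"validated": False, "enzymes": []},
    ],
}

table = precompute(alignments)
good = structures_with_good_enzymes(table)
```

The stored table would also be a natural place to attach caveats per structure once the rule semantics are settled.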

ReneRanzinger commented 11 months ago

Needs group input.