mojaie / MolecularGraph.jl

Graph-based molecule modeling toolkit for cheminformatics
MIT License

largestcomponent comment #34

Kelvyn88 closed 3 years ago

Kelvyn88 commented 3 years ago

Hi to all

I was working on writing some functional groups and noticed this:

For largest components to be identified properly, the simpler groups have to be defined first in the .yaml file.

Thanks

mojaie commented 3 years ago

Thank you for sharing your feedback! That may not be what I intended. It is true that the functional group search runs in the order the groups are defined in the .yaml file, but the sub/superset relationship should be determined by the ontology graph regardless of the order in the .yaml file. I'd be happy if you could provide example code for your case.

Kelvyn88 commented 3 years ago

Sure, it is a very short example. Let's try to identify the following molecule:

"S(SC)C"


with the following list for functional groups:

# 1
- key: "CH3"
  query: "[CH3D1;!R]"

# 143
- key: "CH3S"
  have: ["-S-", "CH3"]
  query: "[CH3;!R][#16;!R]"

# 185
- key: "-S-"
  query: "[SX2]"

The result is only CH3 and -S-. If I reorder the groups so that the simplest ones are defined first:

# 1
- key: "CH3"
  query: "[CH3D1;!R]"

# 185
- key: "-S-"
  query: "[SX2]"

# 143
- key: "CH3S"
  have: ["-S-", "CH3"]
  query: "[CH3;!R][#16;!R]"

The result is CH3, -S- and CH3S for all groups, and largestcomponents yields CH3S, which is the correct identification.

Thanks Kelvyn

mojaie commented 3 years ago

Thank you for providing the example. I checked the module and found some code that should be fixed. For now, the workaround is to order the queries as you did.
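The reordering workaround can also be done mechanically. The following is a minimal Python sketch, assuming the .yaml entries have already been loaded into a list of dicts with the `key`, `have` and `query` fields shown above. The helper name `order_by_dependencies` is hypothetical and is not part of MolecularGraph.jl. It only sorts the entries so that every group named in a `have` list comes before the group that uses it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Entries copied from the example above, in the order that gave the wrong result.
groups = [
    {"key": "CH3", "query": "[CH3D1;!R]"},
    {"key": "CH3S", "have": ["-S-", "CH3"], "query": "[CH3;!R][#16;!R]"},
    {"key": "-S-", "query": "[SX2]"},
]


def order_by_dependencies(entries):
    """Return the entries so that every key in `have` precedes its user.

    Hypothetical helper for illustration only. Raises KeyError if a `have`
    list names a key that is not defined, and graphlib.CycleError if the
    `have` references form a cycle.
    """
    by_key = {entry["key"]: entry for entry in entries}
    sorter = TopologicalSorter()
    for entry in entries:
        # Each key depends on the simpler groups listed under `have`.
        sorter.add(entry["key"], *entry.get("have", []))
    return [by_key[key] for key in sorter.static_order()]


ordered = order_by_dependencies(groups)
print([entry["key"] for entry in ordered])  # CH3S comes after CH3 and -S-
```

The sorted list can then be written back to the .yaml file, so the simple groups are always defined before the composite ones.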

Kelvyn88 commented 3 years ago

Thanks @mojaie

Another issue related to the same topic. With the molecule:

image

by using the same functional groups as above:

image

and identifying the largestcomponents:

image

The problem here is that both groups share the same "S"; the correct result should be one CH3S and one CH3. The function identifies the same group twice, even though only one S is available.

I would be very grateful if you could give me more insight into this.

Thanks Kelvyn

mojaie commented 3 years ago

I'm sorry for the late response. This seems to be the same as #31. Your task is a kind of graph matching, but this library does not offer it yet.
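Until the library offers such matching, the shared-atom case can be handled outside it. The following is a minimal Python sketch of one possible approach: greedily keep the largest matches so that each atom is counted at most once. It is not what MolecularGraph.jl does. The function name `select_non_overlapping`, the atom indices, and the match list for a C-S-C fragment are all made up for illustration.

```python
from collections import Counter


def select_non_overlapping(matches):
    """Greedily keep the largest matches so that no atom is used twice.

    `matches` is a list of (group_key, atom_indices) pairs. Hypothetical
    helper: a greedy pass is simple but not guaranteed to be optimal.
    """
    used = set()
    chosen = []
    # Larger groups first; sorted() is stable, so ties keep their input order.
    for key, atoms in sorted(matches, key=lambda m: -len(m[1])):
        atoms = frozenset(atoms)
        if used.isdisjoint(atoms):
            chosen.append((key, atoms))
            used |= atoms
    return chosen


# Assumed input: a C-S-C fragment with atoms 0 (C), 1 (S) and 2 (C).
# Both CH3S matches claim the single sulfur atom (index 1).
matches = [
    ("CH3", {0}), ("CH3", {2}),
    ("-S-", {1}),
    ("CH3S", {0, 1}), ("CH3S", {2, 1}),
]

counts = Counter(key for key, _ in select_non_overlapping(matches))
print(dict(counts))  # one CH3S and one CH3
```

With this input, the second CH3S match is rejected because its sulfur is already taken, which leaves the remaining carbon to be counted as a plain CH3.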