hcji / PyFingerprint

Python tool for generating fingerprints of a molecule
GNU Affero General Public License v3.0

Various updates and improvements #13

Closed Jnelen closed 11 months ago

Jnelen commented 11 months ago

Hi there,

I have been using your tool for many of my own calculations. However, some of the fingerprints did not work properly for me, so I tried to fix them here. I also added some other fingerprints that are easy to generate with the libraries that were already in use. However, I couldn't get the signature (from CDK) and heteroencoder fingerprints to work. I also updated the CDK version and made various other improvements.

P.S.: If you get heteroencoder to work, please let me know. If it's not possible, I would consider removing support for it so we don't have to install all the dependencies and can update the RDKit version. That way the molecular descriptor fingerprint I added would also include a lot more descriptors.

hcji commented 11 months ago

I will check heteroencoder part as soon as possible.

Jnelen commented 11 months ago

> I will check heteroencoder part as soon as possible.

Great, thanks! At first I thought the error was caused by an incompatibility between the h5py and tensorflow versions (which is why I specified the versions in the dependencies in pyfingerprint_env.yml). There was also an error when opening the zip file (wrong filename), but even after changing that it doesn't work. For me it specifically failed when reading the mol_to_latent_model.h5 file.
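
For anyone trying to reproduce this, here is a rough diagnostic sketch (not part of the PR) that separates the two failure modes mentioned above: a wrong filename inside the zip, and a model file that is not readable as HDF5. It uses only the standard library. The helper names `installed_versions` and `check_model_in_zip` are made up for illustration, and the check only looks for the HDF5 signature at offset 0 of the file, which is the usual case but not the only valid layout.

```python
import zipfile
from importlib import metadata

# First 8 bytes of an HDF5 file whose superblock starts at offset 0.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def installed_versions(packages=("h5py", "tensorflow")):
    """Return {package: version or None} for the packages of interest."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


def check_model_in_zip(zip_path, suffix="mol_to_latent_model.h5"):
    """Find the model file in the archive and test its HDF5 signature.

    Returns (member_name, looks_like_hdf5). member_name is None when no
    archive entry ends with `suffix` (hypothetical helper, for illustration).
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith(suffix):
                with archive.open(member) as handle:
                    header = handle.read(len(HDF5_SIGNATURE))
                return member, header == HDF5_SIGNATURE
    return None, False
```

Calling `installed_versions()` in the affected environment shows which h5py and tensorflow versions are actually installed, and `check_model_in_zip(path_to_zip)` reports whether the archive contains the model under the expected name. If the signature check passes but loading still fails, the problem is more likely in the h5py/tensorflow combination than in the file itself.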

Also, maybe we should consider bumping the version to 3.0? I made quite a lot of changes so it matches the default settings better, but this also means that some fingerprints will give different results now compared to the previous version. I also added some preprocessing (setting atom types, adding hydrogens, ...) so the results should be more consistent across the board. A version change might make it more clear that in some cases there are "incompatibilities" with version 2.0.

hcji commented 11 months ago

> > I will check heteroencoder part as soon as possible.
>
> Great, thanks! At first I thought the error was caused by an incompatibility between the h5py and tensorflow versions (which is why I specified the versions in the dependencies in pyfingerprint_env.yml). There was also an error when opening the zip file (wrong filename), but even after changing that it doesn't work. For me it specifically failed when reading the mol_to_latent_model.h5 file.
>
> Also, maybe we should consider bumping the version to 3.0? I made quite a lot of changes so it matches the default settings better, but this also means that some fingerprints will give different results now compared to the previous version. I also added some preprocessing (setting atom types, adding hydrogens, ...) so the results should be more consistent across the board. A version change might make it more clear that in some cases there are "incompatibilities" with version 2.0.

I'm focusing on another project at present, so it may take some time. Once I finish checking heteroencoder, it may be appropriate to update the version number. I tested the function successfully before, so I believe the failure is due to an incompatible environment caused by some updated dependency.