microsoft / syntheseus

Package for Retrosynthetic Planning
https://microsoft.github.io/syntheseus/
MIT License
111 stars 14 forks source link

Unify backward reaction models from `search` and `reaction_prediction` #73

Closed AustinT closed 6 months ago

AustinT commented 7 months ago

Following discussion from #58, this PR unifies the base reaction model class from search and reaction_prediction, finally allowing all code to use a single reaction model base class in all places.

The new class lives in syntheseus/interface/models.py and is mainly an extended version of the earlier ReactionModel class to include:

The following modifications were made to use the new class:

After this, search/reaction_models still contained code for toy reaction models. I thought this was a bit out of place so I moved it to reaction_prediction/inference/toy_models.py and included the word "toy" in the class names to emphasize that these are toy models. I also moved the tests. This allowed search/reaction_prediction to be deleted entirely.

AustinT commented 6 months ago

Looks good so far, but we also need to adapt the "Quick Start" tutorial

My understanding was that if we change the docs it will be updated immediately. Therefore I thought it would make the most sense to update the tutorials when the new release (0.4.0) is ready. This could happen in a separate PR once this PR is merged and the issue with single-step eval is fixed. Does that sound like a good plan @kmaziarz ?

kmaziarz commented 6 months ago

My understanding was that if we change the docs it will be updated immediately. Therefore I thought it would make the most sense to update the tutorials when the new release (0.4.0) is ready.

Hmm, yeah, if one follows the tutorial using 0.3.0 (instead of installing from source) then we shouldn't change it before we release the new version... I guess we needed versioned docs after all 😄

AustinT commented 6 months ago

My understanding was that if we change the docs it will be updated immediately. Therefore I thought it would make the most sense to update the tutorials when the new release (0.4.0) is ready.

Hmm, yeah, if one follows the tutorial using 0.3.0 (instead of installing from source) then we shouldn't change it before we release the new version... I guess we needed versioned docs after all 😄

I think it's fine, especially since the docs are fairly minimal. We can set it up later once the docs are a bit better.

kmaziarz commented 6 months ago

On the topic of get_results in eval_single_step.py, it now repeats the logic for deduplicating the outputs; I think instead of having skip_repeats be an argument to get_results we should just pass the respective config option when creating the model via get_model.

Digging this up, I think it's actually more than duplicated logic: the skip_repeats argument in eval_single_step.py will turn the deduplication logic in the script on/off, but that will have no effect due to deduplication already being turned on in the model itself. I think we can again make use of get_model accepting arbitrary kwargs and do

model = get_model_fn(config, remove_duplicates=config.skip_repeats)

After that we can get rid of the deduplication logic from eval_single_step.py entirely.

AustinT commented 6 months ago

After that we can get rid of the deduplication logic from eval_single_step.py entirely.

Good point, I will make this change shortly (just going for lunch now).

AustinT commented 6 months ago

After that we can get rid of the deduplication logic from eval_single_step.py entirely.

I pushed another commit which does this. What do you think? @kmaziarz?