`train_expansion_keras_model` yields error in dimensions of logits and labels #90

hubbs5 commented 1 year ago


I have a large database of over 92k reaction templates and I'm trying to retrain the expansion model. I ran the pre-process expansion step and moved to train, but am encountering the following error:

Examining the Keras model, it seems to be constructed correctly. Here's the summary:

My only thought is that something didn't quite work as expected in the pre-processing step, which led to an issue with the train_seq or valid_seq generator. I'm unsure how to debug this and would be grateful for any advice!

SGenheden commented 1 year ago

Thanks for your feedback.

This is a hard one to debug. What is the size of your training and validation sets?

I will soon release a new package for training these models that is hopefully more robust than these old scripts, which are frankly not that great.

hubbs5 commented 1 year ago

I did manage to get this fixed. I don't remember exactly what it was, but it was related to the pre-processing script. If I recall correctly, it was dropping molecules that couldn't be converted into fingerprints, so the inputs and labels no longer had matching dimensions when they were passed to the model.
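
For anyone who runs into the same thing, here is a minimal sketch of that failure mode and one way to guard against it. It is not the actual fix and does not use the project's pre-processing code: `featurize_aligned` and `toy_fingerprint` are hypothetical names, and the toy fingerprint just stands in for a real one that returns `None` when a molecule cannot be converted. The idea is to drop the label together with the failed row so the two arrays stay the same length.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Fingerprint = List[int]


def featurize_aligned(
    smiles: Sequence[str],
    labels: Sequence[int],
    to_fingerprint: Callable[[str], Optional[Fingerprint]],
) -> Tuple[List[Fingerprint], List[int], List[int]]:
    """Fingerprint each molecule and drop the label along with any failed row.

    Returns the fingerprints, the labels that were kept, and the indices
    of the rows that were dropped.
    """
    if len(smiles) != len(labels):
        raise ValueError(
            f"got {len(smiles)} molecules but {len(labels)} labels"
        )

    inputs: List[Fingerprint] = []
    kept_labels: List[int] = []
    dropped: List[int] = []
    for idx, (smi, label) in enumerate(zip(smiles, labels)):
        fp = to_fingerprint(smi)
        if fp is None:
            # Conversion failed: skip the label too, and record the row.
            dropped.append(idx)
            continue
        inputs.append(fp)
        kept_labels.append(label)

    # Cheap sanity check before anything is written to disk.
    assert len(inputs) == len(kept_labels)
    return inputs, kept_labels, dropped


def toy_fingerprint(smi: str) -> Optional[Fingerprint]:
    # Hypothetical stand-in for a real fingerprint function.
    # It "fails" on empty strings to imitate an unparsable molecule.
    if not smi:
        return None
    return [len(smi) % 2, smi.count("C") % 2]


inputs, kept_labels, dropped = featurize_aligned(
    ["CCO", "", "c1ccccc1"], [0, 1, 2], toy_fingerprint
)
print(len(inputs), kept_labels, dropped)
```

Logging the dropped indices also makes it easy to see how many molecules the pre-processing step lost before training starts.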