topazape / LSTM_Chem

Implementation of the paper - Generative Recurrent Networks for De Novo Drug Design.
The Unlicense

Batch size in fine-tuning is irrelevant #21

Open phquanta opened 2 years ago

phquanta commented 2 years ago

I've noticed that, even though a batch size option exists for fine-tuning, it effectively has to be 1 for the code to work. Setting it to anything larger triggers an error, because self.max_len is 0 and no padding takes place. I'm not sure how training would be affected by padding to max_len versus not using max_len at batch size 1.
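
For reference, a minimal, self-contained illustration (independent of the repo's code) of why the unpadded case breaks batching: one-hot encoded sequences of different lengths cannot be stacked into a single numpy array.

```python
# Illustrative only, not the repo's code: two tokenized "SMILES" of different
# lengths cannot be stacked into one array, which is what batch_size > 1 needs.
import numpy as np

vocab_size = 4
seq_a = [0, 1, 2]          # length 3
seq_b = [0, 1, 2, 3, 1]    # length 5

def one_hot(seq, vocab_size):
    arr = np.zeros((len(seq), vocab_size), dtype=np.float32)
    arr[np.arange(len(seq)), seq] = 1.0
    return arr

batch = [one_hot(seq_a, vocab_size), one_hot(seq_b, vocab_size)]

try:
    X = np.array(batch)    # shapes (3, 4) and (5, 4) differ along axis 0
    print(X.shape)
except ValueError as e:
    # Recent NumPy raises here; older versions silently build a ragged object
    # array, which Keras then rejects when the batch is fed to the model.
    print("Cannot stack unequal-length sequences:", e)
```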

taoshen99 commented 2 years ago

I'm hitting the same error as @phquanta when setting "finetune_batch_size" >= 2. It seems that self.max_len is always 0 when data_type is set to 'finetune', so no 'A' padding is applied to the raw SMILES. As a result, the arrays produced by numpy.array(X) (line 105 in data_loader.py) differ along axis 0 and cannot be concatenated. I believe SMILES used for fine-tuning should also be padded to the max length.
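
A minimal sketch of that proposed fix, assuming right-padding with the 'A' padding character mentioned above; the helper name and signature here are illustrative, not the repo's actual API:

```python
# Sketch: pad every fine-tuning SMILES to the longest SMILES in the dataset so
# that numpy.array(X) yields a uniform (batch, max_len, vocab) tensor.
import numpy as np

PAD_CHAR = 'A'  # padding symbol referenced in the comment above (assumption)

def pad_smiles(smiles_list, max_len=None, pad_char=PAD_CHAR):
    """Right-pad each SMILES string to a common length."""
    if max_len is None:
        max_len = max(len(s) for s in smiles_list)
    return [s + pad_char * (max_len - len(s)) for s in smiles_list], max_len

# With padding, every sample has the same length, so batches of any size can be
# stacked without the axis-0 mismatch.
smiles = ['CCO', 'c1ccccc1O', 'CC(=O)N']
padded, max_len = pad_smiles(smiles)
print(max_len, padded)
```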