Closed SabariKumar closed 10 months ago
Hi @SabariKumar, thanks for reporting. In the same environment, could you please check whether you can successfully run the training example described here: https://github.com/GT4SD/gt4sd-core/tree/main/examples/regression_transformer#finetuning
This substitutes the train/test paths with the test files inside the gt4sd directory. If that training run succeeds, I would suspect your error is due to a data formatting problem.
Also, the first line of the error is:
site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
Maybe you have a single-token molecule like C or similar? It's a bit suspicious that you can complete 6% of the epoch before the error occurs.
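One way to test the single-token hypothesis is to scan the training CSV for very short molecules before launching the trainer. The sketch below is only an approximation: the regex is a generic SMILES tokenizer, not the tokenizer the Regression Transformer actually uses, and `find_short_molecules`, `count_tokens`, and the sample rows are hypothetical names and made-up data for illustration. The `text` column name is taken from the report.

```python
import csv
import io
import re

# Generic SMILES tokenizer regex (assumption: only approximates the
# tokenization performed inside the Regression Transformer).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def count_tokens(smiles):
    """Count approximate SMILES tokens in a string."""
    return len(SMILES_TOKEN.findall(smiles.strip()))


def find_short_molecules(csv_file, column="text", min_tokens=2):
    """Return (row_number, smiles) pairs with fewer than min_tokens tokens."""
    reader = csv.DictReader(csv_file)
    return [
        (row_number, row[column])
        # Row 1 is the header, so data rows start at 2.
        for row_number, row in enumerate(reader, start=2)
        if count_tokens(row[column]) < min_tokens
    ]


# Made-up sample data standing in for train.csv.
sample = io.StringIO("text,prop0\nCCO,0.41\nC,0.36\nc1ccccc1,0.44\n")
short = find_short_molecules(sample)
print(short)
```

To run it on the real data, pass an open file instead of the sample, for example `open("/home/sabari/rt_test/train.csv", newline="")`. Any rows it reports are candidates for removal before retrying the training.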
Closing due to inactivity. Feel free to comment if the issue persists.
Training of regression transformer fails with tensor shape mismatch.
To Reproduce
gt4sd-trainer --training_pipeline_name regression-transformer-trainer --model_path ~/.gt4sd/algorithms/conditional_generation/RegressionTransformer/RegressionTransformerMolecules/qed --do_train --output_dir /home/sabari/PhotoChem/VerdeDB/regression_transformer --train_data_path /home/sabari/rt_test/train.csv --test_data_path /home/sabari/rt_test/test.csv --overwrite_output_dir --eval_steps 200 --augment 1 --eval_accumulation_steps 1 --num_train_epochs 100
Expected behavior
Training script completes successfully.
Screenshots Error Stacktrace:
System (please complete the following information):
Additional context
Hello, I'm trying to run fine-tuning using the QED regression transformer model on a custom dataset. My train/test CSVs consist of a single "text" column containing SMILES strings and a single property column, "prop0". Training fails with a tensor shape mismatch in the second dimension (i.e., index 1), regardless of the data augmentation value.
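Since data formatting is one suspected cause, a quick structural check of the CSV may help narrow things down. This is a minimal sketch using only the standard library. It assumes the two-column layout described above (`text` and `prop0`) and does not confirm that this is the layout the trainer expects. `validate_rows` and the sample rows are hypothetical and for illustration only.

```python
import csv
import io
import math


def validate_rows(csv_file, text_column="text", property_column="prop0"):
    """Return (row_number, problem) pairs for rows that look malformed."""
    reader = csv.DictReader(csv_file)
    missing = {text_column, property_column} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    problems = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        smiles = (row[text_column] or "").strip()
        if not smiles:
            problems.append((row_number, "empty SMILES"))
        try:
            value = float(row[property_column])
        except (TypeError, ValueError):
            problems.append((row_number, "non-numeric property"))
            continue
        if not math.isfinite(value):
            problems.append((row_number, "non-finite property"))
    return problems


# Made-up sample data with three deliberately broken rows.
sample = io.StringIO("text,prop0\nCCO,0.41\n,0.36\nCCN,abc\nCCC,nan\n")
problems = validate_rows(sample)
print(problems)
```

An empty result only means the file is structurally consistent. It does not rule out SMILES strings that parse poorly or tokenize to unexpected lengths.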