gnina / scripts

BSD 3-Clause "New" or "Revised" License

I have a question about how to train using train.py. #47

Closed HyungSik-Jo closed 2 years ago

HyungSik-Jo commented 2 years ago

I want to test training with only a subset of my data.

The subset consists of one PDB file and several SDF files.

Each line of the types file is written as follows.

Example: 0 5.058014 data/1V4S_rec_test_0.gninatypes data/conf_5899.sdf # 0

Since the RMSD would be an arbitrary value, I removed the rmsd column and set the has_rmsd option to false in the model.

But now I am getting the following error:

Traceback (most recent call last):
  File "train.py", line 932, in <module>
    results = train_and_test_model(args, train_test_files[i], outname, cont)
  File "train.py", line 499, in train_and_test_model
    solver.step(test_interval)
ValueError: No valid stratified examples.

My guess is that it's a problem with the stratify_receptor option.

If I set that option to false, the error changes to the following:

ValueError: No valid examples found in training set.

I thought the pdb file might not be recognized, so I converted it to gninatypes using gninatyper, but the same error occurs.

Any solution?

dkoes commented 2 years ago

Do you have balanced on? If so, you must have both positive and negative examples (label in first column). If you only have one receptor you should not stratify by receptor.
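A quick way to check both conditions before training is to count the labels and receptors in the types file. The sketch below is not part of the gnina scripts. It assumes the column layout of the example line earlier in this thread (label, affinity, receptor, ligand, optional `#` comment), and the names `summarize_types` and `suggest_options` are hypothetical.

```python
from collections import Counter


def summarize_types(lines):
    """Count labels and receptors in the lines of a types file.

    Assumed layout (taken from the example line in this thread):
    <label> <affinity> <receptor> <ligand> [# comment]
    """
    labels = Counter()
    receptors = Counter()
    for line in lines:
        # Drop the trailing comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        labels[fields[0]] += 1
        receptors[fields[2]] += 1
    return labels, receptors


def suggest_options(labels, receptors):
    """Suggest option values following the advice given in this thread."""
    return {
        # balanced needs both positive and negative examples
        "balanced": len(labels) >= 2,
        # stratifying by receptor is pointless with a single receptor
        "stratify_receptor": len(receptors) >= 2,
    }


example = [
    "0 5.058014 data/1V4S_rec_test_0.gninatypes data/conf_5899.sdf # 0",
]
labels, receptors = summarize_types(example)
print(suggest_options(labels, receptors))
# prints {'balanced': False, 'stratify_receptor': False}
```

To check a real file, pass an open file handle instead of the example list, e.g. `summarize_types(open(path))` with your own path.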

HyungSik-Jo commented 2 years ago

As you advised, training seems to work after setting the balanced option to false.

However, I cannot set the stratify_receptor option to false because that causes an error.

After changing the balanced option, the following warning is raised: Warning: only one unique label

Can I use that result if there are no major issues?

I still don't know the exact function of the balanced and stratify_receptor options.

dkoes commented 2 years ago

Balancing makes sure there are equal numbers of each label in each batch, oversampling the less common label as necessary. You aren't using the label so you should not have this on. Stratification balances the receptors.
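To illustrate the idea of balancing, here is a minimal sketch in plain Python. It is not the implementation used by gnina; `balanced_batches` is a hypothetical helper that draws the same number of examples per label for every batch and cycles through each label's examples, so the rarer label is repeated more often.

```python
import itertools
import random
from collections import defaultdict


def balanced_batches(examples, batch_size, num_batches, seed=0):
    """Yield batches with an equal number of examples per label.

    `examples` is a list of (label, item) pairs. Each label's examples are
    shuffled once and then cycled endlessly, so a rare label is oversampled.
    """
    by_label = defaultdict(list)
    for label, item in examples:
        by_label[label].append((label, item))
    if len(by_label) < 2:
        # Same situation as in this thread: nothing to balance.
        raise ValueError("balancing needs at least two distinct labels")
    if batch_size % len(by_label) != 0:
        raise ValueError("batch_size must be divisible by the number of labels")

    rng = random.Random(seed)
    streams = []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        streams.append(itertools.cycle(group))

    per_label = batch_size // len(by_label)
    for _ in range(num_batches):
        batch = []
        for stream in streams:
            # Take the next per_label examples of this label.
            batch.extend(itertools.islice(stream, per_label))
        yield batch


# 1 positive and 5 negatives: the single positive is repeated in every batch.
data = [(1, "pose_a")] + [(0, f"decoy_{i}") for i in range(5)]
for batch in balanced_batches(data, batch_size=4, num_batches=2):
    print([label for label, _ in batch])
```

Each printed batch contains two examples of label 0 and two of label 1, even though the data holds only one positive. With a single label the helper raises an error, which is why balancing should be off when the label is not used.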

croshong commented 1 year ago

I have a question related to balancing and stratify_receptor.

When I train a model with default2018, there is no problem, but when I try to train a model with dense, I get the following error:

  File "train.py", line 935, in <module>
    results = train_and_test_model(args, train_test_files[i], outname, cont)
  File "train.py", line 502, in train_and_test_model
    solver.step(test_interval)
ValueError: No valid stratified examples.

The balancing and stratify_receptor settings are the same in dense.model and default2018.model,

and my training data contains both positive and negative labeled samples.

Any clue as to what could cause this error?

Thanks