huibinshen / fingerid

Molecular fingerprint prediction from MS/MS (FingerID).
Apache License 2.0

error in trainSVM.py with 16 CPUs #3

Closed tobigithub closed 8 years ago

tobigithub commented 8 years ago

Hi, I am using fingerid 1.4 on a machine with 16 physical CPUs. When I set the number of processes to 16 in train_test.py,

    # MODELS is the folder to store trained models.
    prob = True  # Set prob=True if you want probability output
    trainModels(train_ckm, labels, "MODELS", select_c=False, n_p=16, prob=prob)

I get the error below.

~/fingerid/examples$ python train_test.py 

Computing combined kernel for ALIGN
train models and make prediction
Create directory MODELS to store the trained models
Traceback (most recent call last):
  File "train_test.py", line 93, in <module>
    trainModels(train_ckm, labels, "MODELS", select_c=False, n_p=16, prob=prob)
  File "../../fingerid/fingerid/model/trainSVM.py", line 76, in trainModels
    args=(x, labels, model_dir, task_dict[i], prob))
KeyError: 10

Just to double-check that they are correctly detected: there are indeed 16 CPUs visible to Python.

>>> import multiprocessing
>>> multiprocessing.cpu_count()
16

Any value above 10 throws the error. This does not happen in shen_ISMB2014.py, where I can set 32 CPUs and it still runs fine. The run time is the same, though, with only 1 CPU in use (as reported before).

Tobias

huibinshen commented 8 years ago

Hi, this is a bug. The error occurs because the number of processes is larger than the number of fingerprints you want to predict.

huibinshen commented 8 years ago

The bug is now fixed: n_p = min(n_p, n_fp).
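
For readers hitting the same traceback, here is a minimal sketch of how this kind of KeyError can arise and how the clamp avoids it. It is not the actual code in trainSVM.py: the helpers build_task_dict and tasks_per_worker are hypothetical, and the round-robin assignment of fingerprints to worker slots is an assumption. Only the names n_p, n_fp and task_dict come from this thread.

```python
def build_task_dict(n_fp, n_p):
    """Hypothetical helper: assign fingerprint indices to worker slots round-robin.

    Only slots that receive at least one fingerprint get a key, so with
    n_fp=10 and n_p=16 the dictionary has keys 0..9 only.
    """
    task_dict = {}
    for fp_index in range(n_fp):
        task_dict.setdefault(fp_index % n_p, []).append(fp_index)
    return task_dict


def tasks_per_worker(n_fp, n_p, clamp=True):
    """Return the list of fingerprint indices handled by each worker.

    With clamp=True the number of processes is capped at the number of
    fingerprints, mirroring n_p = min(n_p, n_fp) from the fix above.
    With clamp=False, looking up a slot that has no tasks raises KeyError.
    """
    if clamp:
        n_p = min(n_p, n_fp)
    task_dict = build_task_dict(n_fp, n_p)
    # One lookup per worker slot; this is where an empty slot would fail.
    return [task_dict[i] for i in range(n_p)]


if __name__ == "__main__":
    # Clamped: 16 requested processes are reduced to 10, one fingerprint each.
    print(tasks_per_worker(n_fp=10, n_p=16))

    # Unclamped: slot 10 has no entry in task_dict.
    try:
        tasks_per_worker(n_fp=10, n_p=16, clamp=False)
    except KeyError as err:
        print("KeyError:", err)
```

Under these assumptions, 10 fingerprints and n_p=16 leave worker slots 10 to 15 without an entry, which would explain why any value above 10 failed while 10 or fewer worked.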

tobigithub commented 8 years ago

Thanks