cansyl / DEEPScreen

DEEPScreen: Virtual Screening with Deep Convolutional Neural Networks Using Compound Images

ValueError: cannot reshape array of size 13797420 into shape (200,200,1) #1

Closed ghost closed 4 years ago

ghost commented 4 years ago

I am attempting to reproduce the results in your paper and then train models on my own dataset, but several models failed to train, saying "ValueError: cannot reshape array"

Any idea on how to fix this??

Traceback (most recent call last): 
  File "trainDEEPScreenDUDE.py", line 226, in <module>
    trainModelTarget(model_name, trgt, optim, learning_rate, n_epoch, n_of_h1, n_of_h2, dropout_keep_rate, rotate,save_model)
  File "trainDEEPScreenDUDE.py", line 51, in trainModelTarget
    X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
ValueError: cannot reshape array of size 13797420 into shape (200,200,1)

ghost commented 4 years ago

I investigated a little and found that the problem probably arises from dataProcessing.py. The function drawMolFromSmiles does not work properly: it generates .svg files of size 250x250, and when the .svg files are converted to .png, the size becomes 266x266 even though IMG_SIZE is strictly set to 200. A more serious problem appears later: the command img_arr = cv2.imread(path, cv2.IMREAD_GRAYSCALE) gives a 266x266 array for each image, and every element in the arrays is 255, so the images are evidently not processed properly. Attached are two examples. Any idea?
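
The numbers in the traceback are consistent with this diagnosis. A minimal sketch, assuming only NumPy; the helper name `images_of_side` is hypothetical and not part of DEEPScreen:

```python
import numpy as np

IMG_SIZE = 200          # size the training script expects
TOTAL = 13797420        # array size reported in the ValueError


def images_of_side(total, side, channels=1):
    """Return how many side x side images fit exactly into `total` values, or None."""
    per_image = side * side * channels
    return total // per_image if total % per_image == 0 else None


print(images_of_side(TOTAL, 200))  # None: not a whole number of 200x200 images
print(images_of_side(TOTAL, 266))  # 195: a whole number of 266x266 images

# The same failure on a tiny stand-in batch of 266x266 grayscale images.
batch = np.zeros((3, 266, 266), dtype=np.uint8)
try:
    batch.reshape(-1, IMG_SIZE, IMG_SIZE, 1)
except ValueError as err:
    print("reshape failed:", err)
```

The reported size divides evenly by 266 x 266 but not by 200 x 200, which is what one would expect if every image in the batch was read as 266x266.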

example.zip

tuncadogan commented 4 years ago

Thank you for raising the issue and for the explanation. We are investigating it and will reply again as soon as we have addressed it.

ahmetrifaioglu commented 4 years ago

I have run the script without getting any error and unfortunately I could not reproduce the error that you got. I checked the image sizes and they are all 200x200 images. I also checked the shape of img_arr and it is (200, 200).

From the error it seems that cairosvg or Draw.MolToFile creates differently sized images even though we specify the image size as 200x200. The only explanation that comes to mind is that we are using different versions of the libraries and the newer functions somehow add outer frames or something similar to the images. The exact versions that we are using are listed on the readme page. We will check this issue further and let you know if we can reproduce the error and find a solution.

ghost commented 4 years ago

My tools are as follows (new Anaconda env; the packages were installed with conda install):

python 3.5.6
tensorflow 1.10.0 gpu_py35hd9c640d_0
tensorflow-base 1.10.0 gpu_py35had579c0_0
tensorflow-gpu 1.10.0 hf154084_0
tflearn 0.3.2 py35h05ed11d_0 contango
scikit-learn 0.19.2 py35h4989274_0
numpy 1.14.5 py35h1b885b7_4
numpy-base 1.14.5 py35hdbf6ddf_4
cairosvg 2.4.2 py_0 conda-forge
rdkit 2018.03.4 py35ha4bbe77_0 conda-forge
opencv3 3.1.0 py35_0 menpo

tuncadogan commented 4 years ago

Unfortunately, we could not replicate the error no matter what we tried. Could you please try with the given tool/library versions in our repository:

Python 3.5.2
Tensorflow 1.12.0
Tflearn 0.3.2
Sklearn 0.19.2
Numpy 1.14.5
CairoSVG 2.1.2
RDkit 2016.09.4
OpenCV 3.3.0

We hope that we can have a better idea about the issue if you could try this. Thank you.
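
To compare an environment against the versions listed above, something like the following may help. It is a minimal sketch: the import names (`sklearn` for scikit-learn, `cv2` for OpenCV) are the usual ones, the function name is hypothetical, and not every library is guaranteed to expose `__version__`, hence the fallback.

```python
import importlib

# Import names of the libraries listed above (cv2 is OpenCV, sklearn is scikit-learn).
MODULES = ["tensorflow", "tflearn", "sklearn", "numpy", "cairosvg", "rdkit", "cv2"]


def installed_versions(modules=MODULES):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Fall back to "unknown" when the module has no __version__ attribute.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `installed_versions()` inside the conda environment and printing the result gives a quick list to paste into a report like this one.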

ghost commented 4 years ago

Sure. I'll try this to see what is happening. Thank you very much.

ghost commented 4 years ago

I'm sorry, I installed these package versions but still got error messages: the shape of the images remains 266x266, so it breaks. Could this be related to my OS? I am on a CentOS 7 GPU platform.

By the way, I got some warnings:

(deepscreen) [ai_robot@gpu bin]$ python trainDEEPScreenDUDE.py ImageNetInceptionV2 hdac8 adam 0.0001 5 0 0 1 1 1
WARNING:tensorflow:From /data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.5/site-packages/tflearn/initializations.py:119: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
WARNING:tensorflow:From /data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.5/site-packages/tflearn/objectives.py:66: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2020-03-05 08:38:29.950471: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Number of active compounds :    78
Number of inactive compounds :  117
Number of active test compounds :       20
Number of inactive test compounds :     30
(266, 266)
(266, 266)
(266, 266)
(266, 266)
...

OK, after some struggling, I set scale=200/266 in the svg2png call in dataProcessing.py; the png files are now 200x200 and training is in progress.
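
The workaround described here can be sketched as follows. `cairosvg.svg2png` accepts a `scale` argument; the helper names, the paths and the default of 266 are assumptions taken from this thread, not DEEPScreen's own code:

```python
def scale_for_target(rendered_px, target_px=200):
    """Scale factor that maps the size actually rendered to the size wanted."""
    return target_px / rendered_px


def svg_to_png_scaled(svg_path, png_path, rendered_px=266, target_px=200):
    """Convert an SVG to PNG, shrinking it so the PNG comes out near target_px.

    rendered_px is the PNG side observed without scaling (266 in this thread).
    """
    import cairosvg  # imported lazily so scale_for_target works without it

    cairosvg.svg2png(
        url=svg_path,
        write_to=png_path,
        scale=scale_for_target(rendered_px, target_px),
    )
```

Because the factor is tied to a measured size, it is worth checking the shape of the resulting PNGs afterwards; a different library version may render a different unscaled size.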

While I can train models now, I also want to predict the activity of some SMILES. I ran the command

python loadDEEPScreenModel.py  CHEMBL286 CNNModel_CHEMBL286_adam_0.0005_15_256_0.6_True-525 sample_test_compound_file.txt

but was informed

Traceback (most recent call last):
  File "loadDEEPScreenModel.py", line 88, in <module>
    loadModel(chembl_target, model_fl)
  File "loadDEEPScreenModel.py", line 47, in loadModel
    chembl_target_threshold_dict = getModelThresholds("deepscreen_models_hyperparameters_performance_results.tsv")
  File "/home/ai_robot/data/DEEPScreen/bin/dataProcessing.py", line 1172, in getModelThresholds
    log_fl, modelname, target, optimizer, learning_rate, epoch, hidden1, hidden2, dropout, rotate, save_model, test_f1score, test_mcc, test_accuracy, test_precision, test_recall, test_tp, test_fp, test_tn, test_fn, test_threshold, val_auc, val_auprc, test_auc, test_auprc = line.split("\t")
ValueError: not enough values to unpack (expected 25, got 20)

I know this is due to the data structure: the getModelThresholds function reads and extracts information from resultFiles/deepscreen_models_hyperparameters_performance_results.tsv; however, getModelThresholds expects 25 columns while that file contains only 20 columns. I looked at the file and found that the following 5 columns are not included in the .tsv file:

test_tp
test_fp
test_tn
test_fn
test_threshold

I cannot just remove these 5 columns from getModelThresholds, because the value it returns is test_threshold itself. I imagine that if these column names were removed from dataProcessing.py, there would be numerous errors. Could you please provide the file with the 25 columns, or let me know how to generate it after training? Thank you very much.
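
Until a corrected file is available, the mismatch can at least be detected with a clearer message than the unpacking error. A minimal sketch: the column names come from the traceback above, `parse_result_line` is a hypothetical helper, and it assumes the 20-column file keeps the remaining columns in the same order, which this thread does not confirm.

```python
COLUMNS_25 = [
    "log_fl", "modelname", "target", "optimizer", "learning_rate", "epoch",
    "hidden1", "hidden2", "dropout", "rotate", "save_model", "test_f1score",
    "test_mcc", "test_accuracy", "test_precision", "test_recall", "test_tp",
    "test_fp", "test_tn", "test_fn", "test_threshold", "val_auc", "val_auprc",
    "test_auc", "test_auprc",
]
MISSING = {"test_tp", "test_fp", "test_tn", "test_fn", "test_threshold"}
# Assumption: the shorter file simply omits the five columns, order unchanged.
COLUMNS_20 = [name for name in COLUMNS_25 if name not in MISSING]


def parse_result_line(line):
    """Split one tab-separated line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    for names in (COLUMNS_25, COLUMNS_20):
        if len(fields) == len(names):
            return dict(zip(names, fields))
    raise ValueError("expected 25 or 20 columns, got {}".format(len(fields)))
```

A row parsed from the 20-column file has no `test_threshold` key, so code that needs the threshold can report that the file is the older layout instead of failing while unpacking.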

tuncadogan commented 4 years ago

Sorry for the late reply. First of all, considering the image size issue:

We still could not figure out what is causing this problem. The systems on which we have tested DEEPScreen are MacOS (10.12 or newer) and Linux Ubuntu (14.04). If you have a chance to try one of these operating systems, we may get a better idea. Sorry that we could not solve it on our end.

Considering rescaling, since the compound images will be different from natively generated 200x200 images, there may be some performance differences. We have covered similar issues, by doing a few tests, in our manuscript (in supplementary material).

Second, regarding the columns in resultFiles/deepscreen_models_hyperparameters_performance_results.tsv: thank you for letting us know about this problem, which is due to an update in our repository. We are working on constructing the file with the correct number of columns (including the necessary information in the table), and we will upload the correct file once the process is finished.

ghost commented 4 years ago

Thank you so much. I'll try to test this on other OS. Looking forward to seeing your updated files.

cristianregep commented 4 years ago

I have the same issue reported here on Ubuntu 18.04. You can see that with the original code the images that get generated have the following specification: CHEMBL288346.svg SVG 250x250 250x250+0+0 16-bit sRGB 21.4KB 0.000u 0:00.009

tuncadogan commented 4 years ago

Thank you for your interest. We could not reproduce this error no matter what we tried. Would it be possible for you to try it with the given tool/library versions in our repository:

Python 3.5.2
Tensorflow 1.12.0
Tflearn 0.3.2
Sklearn 0.19.2
Numpy 1.14.5
CairoSVG 2.1.2
RDkit 2016.09.4
OpenCV 3.3.0

cristianregep commented 4 years ago

Anaconda finds those combinations incompatible with each other so that's a no go. I did change lines 219 to 221 in bin/dataProcessing.py to the below code and it works

Draw.MolToFile(mol, "{}/{}.svg".format(output_path, id), size=(160, 160))
cairosvg.svg2png(url='{}/{}.svg'.format(output_path, id), write_to="{}/{}.png".format(output_path, id), output_width=200, output_height=200)

I fully realise this doesn't make sense (the 160 bit), but it works.
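
Whichever workaround is used, the generated files can be checked before training. A minimal sketch using only the standard library: it reads the width and height from the PNG header, and both function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) from the IHDR chunk at the start of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers right after "IHDR".
    return struct.unpack(">II", data[16:24])


def check_png(path, expected=200):
    """Raise ValueError if the PNG at `path` is not expected x expected pixels."""
    with open(path, "rb") as handle:
        size = png_size(handle.read(24))
    if size != (expected, expected):
        raise ValueError(
            "{} is {}x{}, expected {}x{}".format(path, size[0], size[1], expected, expected)
        )
    return size
```

Running `check_png` over the output directory right after image generation would surface a 266x266 file immediately, rather than at the reshape step during training.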

tuncadogan commented 4 years ago

That sounds like a suitable quick-fix.

Also, we plan to be working on fixing all these errors. We hope that it will be ready to be deployed soon.

ghost commented 4 years ago

Glad to know that you are currently working on these issues, and looking forward to seeing the release.

HaseebYounis2 commented 4 years ago

Respected Sir, I am attempting to reproduce the results in your paper, but I get the error shown below, even though I am using the same Python and library versions that you described.

import cairocffi as cairo

File "C:\Users\Haseeb Younas\AppData\Local\Programs\Python\Python35\lib\site-packages\cairocffi\__init__.py", line 50, in <module>
    ('libcairo.so', 'libcairo.2.dylib', 'libcairo-2.dll'))
File "C:\Users\Haseeb Younas\AppData\Local\Programs\Python\Python35\lib\site-packages\cairocffi\__init__.py", line 45, in dlopen
    raise OSError(error_message)  # pragma: no cover
OSError: no library called "cairo" was found
no library called "libcairo-2" was found
cannot load library 'libcairo.so': error 0x7e
cannot load library 'libcairo.2.dylib': error 0x7e
cannot load library 'libcairo-2.dll': error 0x7e

would you please help me to solve this problem?

ghost commented 4 years ago

It seems that you are running this on a Windows computer. I don't think that these libraries are compatible with Windows OS. If you could try these on a Linux or MacOS device, they should work.
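
The `error 0x7e` entries are Windows error 126 (module not found), which suggests that the native cairo library could not be located, rather than that the Python package itself is missing. A minimal sketch for checking this with only the standard library; the candidate names mirror those in the error message, and the function name is hypothetical:

```python
import ctypes.util


def find_cairo(names=("cairo", "libcairo-2")):
    """Return the first cairo shared library the system loader can find, or None."""
    for name in names:
        path = ctypes.util.find_library(name)
        if path:
            return path
    return None


print(find_cairo() or "cairo shared library not found on this system")
```

If this prints the "not found" message, the Python bindings have nothing to load, regardless of which package versions are installed.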

HaseebYounis2 commented 4 years ago

Sir, thanks for the quick reply. Please let me check on Linux, then I'll let you know.

HaseebYounis2 commented 4 years ago

@bellstwohearted Thanks for the support; it worked on Linux, but I am getting the same error that others are facing in this thread. @tuncadogan Sir, could you please tell me how long it will take to fix these errors?

tuncadogan commented 4 years ago

We expect to finish in about 2 to 2.5 weeks. But if you are in a hurry, you can follow the steps in our readme to train a DEEPScreen classifier for your target protein of interest.

@bellstwohearted thank you for your help to @HaseebYounis2

HaseebYounis2 commented 4 years ago

I have also run the training process, and it generates the same error that other people have reported. So, for now I am moving on to the data preprocessing and curation step of this paper. I'll wait for the updated data from your side. Thanks a lot.

ahmetrifaioglu commented 4 years ago

Hi, we are sorry for the delay. We are trying our best to update the system. We had to make some major changes to create a new version. We are planning to release the new implementation by this Friday. I will post an update once we finish the initial development and release the code. Best

ahmetrifaioglu commented 4 years ago

Hi,

We are sorry for the late response again. These are quite busy and hectic times for us, and we had to make some major changes in the implementation of DEEPScreen, as I mentioned before. The main change is that we decided not to proceed with tflearn, as the version that we had used became too old (it has been almost 4 years since we started this project) and we encountered other problems and incompatibilities among the new versions of libraries when we wanted to make changes. Others also reported installation problems.

For these reasons, DEEPScreen has been re-implemented using PyTorch. We created all the training/test/validation images for all targets in order to avoid the image size, quality and library issues. So, you can use the readily available images to train models for the targets. The new version has been tested on MacOSx and Linux. Unfortunately, we have not yet been able to work on CNN architectures in detail and create models for each target as it is required to perform hyper-parameter search for all of the targets separately. But we are planning to work on it next.

Here is a summary of the changes:

The implementation was done using the latest versions of all libraries (PyTorch, RDkit, etc.).
The filtered and preprocessed dataset was updated using ChEMBL version 27.
The number of targets increased from 704 to 812 with the updated training datasets.
Training, validation and test images were created for each target.

Here are the things that we are planning to do next:

Adding other CNN architectures, such as InceptionV3, for training
Performing hyperparameter search and generating target-specific models
Developing scripts for easy testing using the generated models

I am closing this issue now. Please let us know about any problems that you encounter.

Best

ghost commented 4 years ago

@ahmetrifaioglu Thank you for the updates. When I am trying to run the example you give, I get an error, saying

Namespace(bs=64, dropout=0.25, en='my_chembl286_training', epoch=100, fc1=256, fc2=128, lr=0.01, model='CNNModel1', targetid='CHEMBL286')
Arguments: CHEMBL286-CNNModel1-256-128-0.01-64-0.25-100-my_chembl286_training
GPU is available on this device!
Epoch :0
Training mode: True
Epoch 0 training loss: 23.89242261648178
Traceback (most recent call last):
  File "main_training.py", line 67, in <module>
    args.dropout, args.epoch, args.en)
  File "/data/ai_robot/DEEPScreen/bin/train_deepscreen.py", line 143, in train_validation_test_training
    training_perf_dict = prec_rec_f1_acc_mcc(all_training_labels, np.array(all_training_preds))
  File "/data/ai_robot/DEEPScreen/bin/evaluation_metrics.py", line 10, in prec_rec_f1_acc_mcc
    precision = metrics.precision_score(y_true, y_pred)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/utils/validation.py", line 73, in inner_f
    return f(**kwargs)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1623, in precision_score
    zero_division=zero_division)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/utils/validation.py", line 73, in inner_f
    return f(**kwargs)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1434, in precision_recall_fscore_support
    pos_label)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1250, in _check_set_wise_labels
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/data/ai_robot/Anaconda3/envs/deepscreen/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 98, in _check_targets
    raise ValueError("{0} is not supported".format(y_type))
ValueError: unknown is not supported

It seems that some data type errors occur at

training_perf_dict = prec_rec_f1_acc_mcc(all_training_labels, np.array(all_training_preds))

Any suggestions?

ahmetrifaioglu commented 4 years ago

@bellstwohearted We tried to reproduce the error on four different machines having Linux and MacOSx operating systems with multiple trials but we could not reproduce the error. I also searched the error and I could not find a clear answer. Are you using the same versions of libraries? If so, the only thing that comes to my mind is that there were no true predictions at epoch 1. I have now added exception handling for the performance calculation. This should resolve the error, if the error occurred due to no true predictions in the first epochs.
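
The exception handling mentioned here is not shown in this thread. Separately, one possible cause of `unknown is not supported` (not confirmed here) is that the labels or predictions reach scikit-learn as an object-dtype or nested array rather than a flat integer array. A minimal sketch of flattening such input before computing metrics, assuming the values are integers or NumPy arrays; `to_label_array` is a hypothetical helper and not part of DEEPScreen:

```python
import numpy as np


def to_label_array(values):
    """Flatten a sequence of scalars and/or arrays into a 1-D integer array."""
    if len(values) == 0:
        return np.empty(0, dtype=np.int64)
    # Turn every item into a 1-D array so uneven per-batch shapes can be joined.
    parts = [np.asarray(v).reshape(-1) for v in values]
    return np.concatenate(parts).astype(np.int64)


mixed = [1, np.array(0), np.array([1, 0])]  # stand-in for uneven per-batch outputs
print(to_label_array(mixed))
```

Passing both the labels and the predictions through the same helper gives the metric functions two flat arrays of the same dtype, which makes a type mismatch easier to rule out.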

reemlores commented 3 years ago

I got the same error:

X_data = X_data.reshape(203,10,200, 200,3)
ValueError: cannot reshape array of size 398400000 into shape (203,10,200,200,3)

Please help!

tuncadogan commented 3 years ago

We have a new version written in PyTorch. In this version we discarded the whole image generation module because of these errors, and we feed the system directly with pre-generated images. I believe you are using the old version of our tool. Please switch to the pytorch branch and follow the instructions for training/testing a model. Let me know if you have further questions.