tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

TypeError: can't pickle _thread.RLock objects: SKlearn/Tensorflow/Third Party content interference? #42641

Closed · tolandwehr closed this 3 years ago

tolandwehr commented 4 years ago

For a neural-network regression prediction task, scikit-learn's cross_val_predict throws the error below (full traceback further down, after the model definition)

TypeError: can't pickle _thread.RLock objects

when used with input data of shape

X_train.shape = (1200, 18, 15)
y_train.shape = (1200, 18, 1)

and the following network:

def twds_model(layer1=32, layer2=32, layer3=16, dropout_rate=0.5, optimizer='Adam',
               learning_rate=0.001, activation='relu', loss='mse'):

    model = Sequential()
    model.add(Bidirectional(GRU(layer1, return_sequences=True),input_shape=(X_train.shape[1],X_train.shape[2])))
    model.add(AveragePooling1D(2))
    model.add(Conv1D(layer2, 3, activation=activation, padding='same', 
               name='extractor'))
    model.add(Flatten())
    model.add(Dense(layer3,activation=activation))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1))
    model.compile(optimizer=optimizer,loss=loss)
    return model

twds_model=twds_model()
print(twds_model.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bidirectional_4 (Bidirection (None, 18, 64)            9216      
_________________________________________________________________
average_pooling1d_1 (Average (None, 9, 64)             0         
_________________________________________________________________
extractor (Conv1D)           (None, 9, 32)             6176      
_________________________________________________________________
flatten_1 (Flatten)          (None, 288)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 16)                4624      
_________________________________________________________________
dropout_4 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 17        
=================================================================
Total params: 20,033
Trainable params: 20,033
Non-trainable params: 0
_________________________________________________________________
None

and

model_twds = KerasRegressor(build_fn=twds_model, batch_size=144, epochs=6)

The error was already raised over at scikit-learn, but the discussion there suggested that it most likely originates on the TensorFlow side, or that it might be caused by third-party content; however, I have no idea which third-party content might be involved.
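
For context, scikit-learn's clone() deep-copies every constructor parameter of an estimator. A live Keras model holds internal _thread.RLock objects (at least in TF 2.3), so deep-copying a built model fails, while deep-copying a plain builder function succeeds. A minimal sketch of the difference (assuming TF 2.3 and tf.keras):

import copy
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_model():
    model = Sequential([Dense(1, input_shape=(4,))])
    model.compile(optimizer='adam', loss='mse')
    return model

copy.deepcopy(build_model)    # fine: plain functions deep-copy trivially
copy.deepcopy(build_model())  # TypeError: can't pickle _thread.RLock objects (TF 2.3)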

The complete error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-603-37b55dfd53fd> in <module>
----> 1 GridLSTM.fit(X_train, y_train)

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    679         n_splits = cv.get_n_splits(X, y, groups)
    680 
--> 681         base_estimator = clone(self.estimator)
    682 
    683         parallel = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\base.py in clone(estimator, safe)
     85     new_object_params = estimator.get_params(deep=False)
     86     for name, param in new_object_params.items():
---> 87         new_object_params[name] = clone(param, safe=False)
     88     new_object = klass(**new_object_params)
     89     params_set = new_object.get_params(deep=False)

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\base.py in clone(estimator, safe)
     69     elif not hasattr(estimator, 'get_params') or isinstance(estimator, type):
     70         if not safe:
---> 71             return copy.deepcopy(estimator)
     72         else:
     73             if isinstance(estimator, type):

~\Anaconda3\envs\Tensorflow\lib\copy.py in deepcopy(x, memo, _nil)
    178                     y = x
    179                 else:
--> 180                     y = _reconstruct(x, memo, *rv)
    181 
    182     # If is its own copy, don't memoize.

~\Anaconda3\envs\Tensorflow\lib\copy.py in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
    278     if state is not None:
    279         if deep:
--> 280             state = deepcopy(state, memo)
    281         if hasattr(y, '__setstate__'):
    282             y.__setstate__(state)

~\Anaconda3\envs\Tensorflow\lib\copy.py in deepcopy(x, memo, _nil)
    148     copier = _deepcopy_dispatch.get(cls)
    149     if copier:
--> 150         y = copier(x, memo)
    151     else:
    152         try:

~\Anaconda3\envs\Tensorflow\lib\copy.py in _deepcopy_dict(x, memo, deepcopy)
    238     memo[id(x)] = y
    239     for key, value in x.items():
--> 240         y[deepcopy(key, memo)] = deepcopy(value, memo)
    241     return y
    242 d[dict] = _deepcopy_dict

~\Anaconda3\envs\Tensorflow\lib\copy.py in deepcopy(x, memo, _nil)
    148     copier = _deepcopy_dispatch.get(cls)
    149     if copier:
--> 150         y = copier(x, memo)
    151     else:
    152         try:

~\Anaconda3\envs\Tensorflow\lib\copy.py in _deepcopy_list(x, memo, deepcopy)
    213     append = y.append
    214     for a in x:
--> 215         append(deepcopy(a, memo))
    216     return y
    217 d[list] = _deepcopy_list

[... the same deepcopy -> _reconstruct -> _deepcopy_dict / _deepcopy_list frames repeat as deepcopy recurses through the model's nested attributes ...]

~\Anaconda3\envs\Tensorflow\lib\copy.py in deepcopy(x, memo, _nil)
    167                     reductor = getattr(x, "__reduce_ex__", None)
    168                     if reductor:
--> 169                         rv = reductor(4)
    170                     else:
    171                         reductor = getattr(x, "__reduce__", None)

TypeError: can't pickle _thread.RLock objects

The versions used:

Package                  Version
------------------------ ---------------
absl-py                  0.9.0
antlr4-python3-runtime   4.8
asn1crypto               1.3.0
astor                    0.7.1
astropy                  3.2.1
astunparse               1.6.3
attrs                    19.3.0
audioread                2.1.8
autopep8                 1.5.3
backcall                 0.1.0
beautifulsoup4           4.9.0
bezier                   0.8.0
bkcharts                 0.2
bleach                   3.1.4
blis                     0.2.4
bokeh                    1.1.0
boto3                    1.9.253
botocore                 1.12.253
Bottleneck               1.3.2
cachetools               4.1.0
certifi                  2020.4.5.1
cffi                     1.14.0
chardet                  3.0.4
click                    6.7
cloudpickle              0.5.3
cmdstanpy                0.4.0
color                    0.1
colorama                 0.4.3
colorcet                 0.9.1
convertdate              2.2.1
copulas                  0.2.5
cryptography             2.8
ctgan                    0.2.1
cycler                   0.10.0
cymem                    2.0.2
Cython                   0.29.17
dash                     0.26.0
dash-core-components     0.27.2
dash-html-components     0.11.0
dash-renderer            0.13.2
dask                     0.18.1
dataclasses              0.6
datashader               0.7.0
datashape                0.5.2
datawig                  0.1.10
deap                     1.3.0
decorator                4.4.2
defusedxml               0.6.0
deltapy                  0.1.1
dill                     0.2.9
distributed              1.22.1
docutils                 0.14
entrypoints              0.3
ephem                    3.7.7.1
et-xmlfile               1.0.1
exrex                    0.10.5
Faker                    4.0.3
fastai                   1.0.60
fastprogress             0.2.2
fbprophet                0.6
fire                     0.3.1
Flask                    1.0.2
Flask-Compress           1.4.0
future                   0.17.1
gast                     0.3.3
geojson                  2.4.1
geomet                   0.2.0.post2
google-auth              1.14.0
google-auth-oauthlib     0.4.1
google-pasta             0.2.0
gplearn                  0.4.1
graphviz                 0.13.2
grpcio                   1.29.0
h5py                     2.10.0
HeapDict                 1.0.0
holidays                 0.10.2
holoviews                1.12.1
html2text                2018.1.9
hyperas                  0.4.1
hyperopt                 0.1.2
idna                     2.6
imageio                  2.5.0
imbalanced-learn         0.3.3
imblearn                 0.0
importlib-metadata       1.5.0
impyute                  0.0.8
ipykernel                5.1.4
ipython                  7.13.0
ipython-genutils         0.2.0
ipywidgets               7.5.1
itsdangerous             0.24
jdcal                    1.4
jedi                     0.16.0
Jinja2                   2.11.1
jmespath                 0.9.5
joblib                   0.13.2
jsonschema               3.2.0
jupyter                  1.0.0
jupyter-client           6.1.2
jupyter-console          6.0.0
jupyter-core             4.6.3
Keras                    2.4.3
Keras-Applications       1.0.8
Keras-Preprocessing      1.1.2
keras-rectified-adam     0.17.0
kiwisolver               1.2.0
korean-lunar-calendar    0.2.1
librosa                  0.7.2
llvmlite                 0.32.1
lml                      0.0.1
locket                   0.2.0
LunarCalendar            0.0.9
Markdown                 2.6.11
MarkupSafe               1.1.1
matplotlib               3.2.1
missingpy                0.2.0
mistune                  0.8.4
mkl-fft                  1.0.15
mkl-random               1.1.0
mkl-service              2.3.0
mock                     4.0.2
msgpack                  0.5.6
multipledispatch         0.6.0
murmurhash               1.0.2
mxnet                    1.4.1
nb-conda                 2.2.1
nb-conda-kernels         2.2.3
nbconvert                5.6.1
nbformat                 5.0.4
nbstripout               0.3.7
networkx                 2.1
notebook                 6.0.3
numba                    0.49.1
numexpr                  2.7.1
numpy                    1.18.5
oauthlib                 3.1.0
olefile                  0.46
opencv-python            4.2.0.34
openpyxl                 2.5.5
opt-einsum               3.2.1
packaging                20.3
pandas                   1.0.3
pandasvault              0.0.3
pandocfilters            1.4.2
param                    1.9.0
parso                    0.6.2
partd                    0.3.8
patsy                    0.5.1
pbr                      5.1.3
pickleshare              0.7.5
Pillow                   7.0.0
pip                      20.2.2
plac                     0.9.6
plotly                   4.7.1
plotly-express           0.4.1
preshed                  2.0.1
prometheus-client        0.7.1
prompt-toolkit           3.0.4
protobuf                 3.11.3
psutil                   5.4.7
py                       1.8.0
pyasn1                   0.4.8
pyasn1-modules           0.2.8
pycodestyle              2.6.0
pycparser                2.20
pyct                     0.4.5
pyensae                  1.3.839
pyexcel                  0.5.8
pyexcel-io               0.5.7
Pygments                 2.6.1
pykalman                 0.9.5
PyMeeus                  0.3.7
pymongo                  3.8.0
pyOpenSSL                19.1.0
pyparsing                2.4.7
pypi                     2.1
pyquickhelper            1.9.3418
pyrsistent               0.16.0
PySocks                  1.7.1
pystan                   2.19.1.1
python-dateutil          2.8.1
pytz                     2019.3
pyviz-comms              0.7.2
PyWavelets               0.5.2
pywin32                  227
pywinpty                 0.5.7
PyYAML                   5.3.1
pyzmq                    18.1.1
qtconsole                4.4.4
rdt                      0.2.1
RegscorePy               1.1
requests                 2.23.0
requests-oauthlib        1.3.0
resampy                  0.2.2
retrying                 1.3.3
rsa                      4.0
s3transfer               0.2.1
scikit-image             0.15.0
scikit-learn             0.23.2
scipy                    1.4.1
sdv                      0.3.2
seaborn                  0.9.0
seasonal                 0.3.1
Send2Trash               1.5.0
sentinelsat              0.12.2
setuptools               46.3.0
setuptools-git           1.2
six                      1.14.0
sklearn                  0.0
sortedcontainers         2.0.4
SoundFile                0.10.3.post1
soupsieve                2.0
spacy                    2.1.8
srsly                    0.1.0
statsmodels              0.9.0
stopit                   1.1.2
sugartensor              1.0.0.2
ta                       0.5.25
tb-nightly               1.14.0a20190603
tblib                    1.3.2
tensorboard              2.3.0
tensorboard-plugin-wit   1.7.0
tensorflow-gpu           2.3.0
tensorflow-gpu-estimator 2.3.0
termcolor                1.1.0
terminado                0.8.3
testpath                 0.4.4
text-unidecode           1.3
texttable                1.4.0
Theano                   1.0.4
thinc                    7.0.8
threadpoolctl            2.1.0
toml                     0.10.1
toolz                    0.10.0
torch                    1.4.0
torchvision              0.5.0
tornado                  6.0.4
TPOT                     0.10.2
tqdm                     4.45.0
traitlets                4.3.3
transforms3d             0.3.1
tsaug                    0.2.1
typeguard                2.7.1
typing                   3.6.6
update-checker           0.16
urllib3                  1.22
utm                      0.4.2
wasabi                   0.2.2
wcwidth                  0.1.9
webencodings             0.5.1
Werkzeug                 1.0.1
wheel                    0.34.2
widgetsnbextension       3.5.1
win-inet-pton            1.1.0
wincertstore             0.2
wrapt                    1.11.2
xarray                   0.10.8
xlrd                     1.1.0
yahoo-historical         0.3.2
zict                     0.1.3
zipp                     2.2.0
amahendrakar commented 4 years ago

@tolandwehr, I was able to run the given code snippet without any issues; please find the gist of it here.

In order to expedite the troubleshooting process, could you please provide the complete code needed to reproduce the issue reported here? Thanks!

tolandwehr commented 4 years ago

Okay. The gist is a little short, as it misses the

cross_val_predict

call, where the error happens. My description might have left this unclear, sorry. Here is the notebook; it is a little messy, though, as it also involves other tasks:

import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = '0' # Set to -1 if CPU should be used CPU = -1 , GPU = 0

gpus = tf.config.experimental.list_physical_devices('GPU')
cpus = tf.config.experimental.list_physical_devices('CPU')

if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
elif cpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        logical_cpus= tf.config.experimental.list_logical_devices('CPU')
        print(len(cpus), "Physical CPU,", len(logical_cpus), "Logical CPU")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

#from __future__ import print_function, division

import plotly.express as px
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import keras
import sys
import os

import tsaug
from tsaug.visualization import plot
from tsaug import TimeWarp, Crop, Quantize, Drift, Reverse, Convolve, AddNoise, Dropout, Pool, Resize

import statsmodels
import datawig
import impyute

import missingpy
from missingpy import KNNImputer,MissForest

from impyute.imputation.cs import mice
from datawig import SimpleImputer
from statsmodels import robust
from operator import itemgetter,attrgetter
from functools import partial
from scipy import stats

from pylab import rcParams
from tpot import TPOTRegressor

from sklearn import preprocessing
from sklearn.decomposition import PCA
from sklearn import model_selection
from sklearn.preprocessing import StandardScaler, MinMaxScaler, Normalizer, LabelEncoder, RobustScaler, QuantileTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score, cross_val_predict, cross_validate, GridSearchCV, RandomizedSearchCV, TimeSeriesSplit, KFold
from sklearn.metrics import mean_squared_error, r2_score, explained_variance_score, make_scorer,median_absolute_error, mean_absolute_error,max_error,explained_variance_score
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_array, check_is_fitted
from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer, SimpleImputer

from tensorflow.keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
from kerastuner import HyperModel

# import tensorflow as tf
# import cProfile
# tf.enable_eager_execution()
# tf.executing_eagerly()

from tensorflow.python.keras.layers import InputLayer, TimeDistributed, Lambda, Dense, Dot, Reshape,Concatenate, Embedding, Activation, Conv1D, Conv2D, Cropping2D, MaxPooling2D, Flatten, Dropout, LSTM, GRU, Bidirectional, Input, LeakyReLU,Conv2DTranspose, ZeroPadding2D, ZeroPadding1D, UpSampling2D, UpSampling1D,multiply,AveragePooling1D # components of network
from tensorflow.python.keras.models import Model, Sequential # type of model
from tensorflow.python.keras.layers import BatchNormalization
from tensorflow.python.keras.optimizers import Adam, RMSprop, SGD, Nadam, Adadelta, Adamax
from tensorflow.python.keras.regularizers import l2
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard

#from SpectralNormalizationKeras import DenseSN, ConvSN2D

from ctgan import CTGANSynthesizer

#import tensorflow.keras.backend as K
from tensorflow.python.keras.backend import expand_dims, squeeze

from tqdm import tqdm

######loading and preparing data

Heisei_T=pd.read_excel('Heisei_Whoa.xlsx')
Heisei_T=Heisei_T.drop(Heisei_T.columns[0], axis=1)

Year_Frame = 18
Set_Len = len(Heisei_T)
Sample_Len = int(Set_Len/Year_Frame)
Set_Width = len(Heisei_T.columns)

Heisei_Heads=Heisei_T.columns
Heisei_TR=np.array(Heisei_T).reshape(Sample_Len,Year_Frame,Set_Width)
np.random.shuffle(Heisei_TR)
Heisei_T=Heisei_TR.reshape(Sample_Len*Year_Frame,Set_Width)

def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
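# (Illustrative toy example, not from the original notebook:
#  split_sequences(np.arange(12).reshape(6, 2), n_steps=3) returns
#  X of shape (4, 3, 1) and y of shape (4,), i.e. each window keeps all
#  but the last column as features and the last column's final value as target.)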

n_steps = 3

X_full=[]
y_full=[]

for x in range(0,Sample_Len):
    X,y=split_sequences(Heisei_T[x*Year_Frame:(x+1)*Year_Frame,:],n_steps)
    X_full.append(X)
    y_full.append(y)

X_full=np.array(X_full)
y_full=np.array(y_full)

X_full.shape
X_full=X_full.reshape(Sample_Len*16,n_steps,Set_Width-1)
y_full=y_full.reshape(Sample_Len*16,1)
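# (Note: 16 == Year_Frame - n_steps + 1, the number of length-3 windows
#  that split_sequences yields from each 18-row sample.)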
#pd.DataFrame(y_full).head(90)

X_full.shape
#y_full.shape

print(pd.DataFrame(y_full).max())

y_test=np.array(pd.DataFrame(y_full).tail(round(0.10001671123*y_full.shape[0])))
y_train=np.array(pd.DataFrame(y_full).head(round((1-0.10001671123)*y_full.shape[0])))

X_test=np.array(pd.DataFrame(X_full.reshape(X_full.shape[0],X_full.shape[1]*X_full.shape[2])).tail(round(0.10001671123*X_full.shape[0]))).reshape(round(0.10001671123*X_full.shape[0]),X_full.shape[1],X_full.shape[2])
X_train=np.array(pd.DataFrame(X_full.reshape(X_full.shape[0],X_full.shape[1]*X_full.shape[2])).head(round(1-0.10001671123*X_full.shape[0]-1))).reshape(round((1-0.10001671123)*X_full.shape[0]),X_full.shape[1],X_full.shape[2])

Train_Len=len(X_train)
Test_Len=len(X_test)

print(y_train.shape,y_test.shape,X_train.shape,X_test.shape)

X_test_scaler = preprocessing.StandardScaler()
y_test_scaler = preprocessing.StandardScaler()
X_train_scaler = preprocessing.StandardScaler()
y_train_scaler = preprocessing.StandardScaler()

y_test=y_test_scaler.fit_transform(y_test)
y_train=y_train_scaler.fit_transform(y_train)

X_test=X_test_scaler.fit_transform(X_test.reshape(Test_Len*n_steps,Set_Width-1))
X_train=X_train_scaler.fit_transform(X_train.reshape(Train_Len*n_steps,Set_Width-1))

X_test=X_test.reshape(Test_Len,n_steps,Set_Width-1)
X_train=X_train.reshape(Train_Len,n_steps,Set_Width-1)

#########Model and SKlearn Cross_Val_Predict

def twds_model(layer1=32, layer2=32, layer3=16, dropout_rate=0.5, optimizer='Adam',
               learning_rate=0.001, activation='relu', loss='mse'):

    model = Sequential()
    model.add(Bidirectional(GRU(layer1, return_sequences=True),input_shape=(X_train.shape[1],X_train.shape[2])))
    model.add(AveragePooling1D(2))
    model.add(Conv1D(layer2, 3, activation=activation, padding='same', 
               name='extractor'))
    model.add(Flatten())
    model.add(Dense(layer3,activation=activation))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1))
    model.compile(optimizer=optimizer,loss=loss)
    return model

twds_model=twds_model()
print(twds_model.summary())

def CustomVarious(y_true, y_pred):
    y_true = y_true.reshape(len(y_true[:,1])*Heisei_TR.shape[1],)

    if np.isnan(y_pred).any():
        MAD = 1000000  # heavy penalty for NaN predictions
    else:
        y_pred = y_pred.reshape(len(y_pred[:,1])*Heisei_TR.shape[1],)
        MAD = median_absolute_error(y_true, y_pred)
        print(MAD)

    return MAD

scorer = make_scorer(CustomVarious, greater_is_better=False)

model_twds = KerasRegressor(build_fn=twds_model, batch_size=256, epochs=6)

############# PLACE OF THE ERROR ############
twds_Pred=cross_val_predict(model_twds, 
               X_train, 
               y_train, 
               n_jobs=1, 
               cv=4, 
               verbose=2)
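
A note on the snippet above: the line twds_model = twds_model() rebinds the builder's name to the built Model, so build_fn ends up holding a live Keras model instead of a plain function; sklearn's clone() inside cross_val_predict then has to deepcopy that model, which is exactly where the _thread.RLock objects turn up under TF 2.3. A minimal sketch of the variant that keeps build_fn a callable (an assumption worth testing, not a confirmed fix from this thread):

# Sketch (assumption): avoid rebinding the builder's name, so build_fn
# stays a plain function, which sklearn's clone() can deep-copy safely.
summary_model = twds_model()          # build once, only to print the summary
print(summary_model.summary())

model_twds = KerasRegressor(build_fn=twds_model,  # the function, not a model
                            batch_size=256, epochs=6)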
amahendrakar commented 4 years ago

@tolandwehr, On running the code I am facing an error stating FileNotFoundError: [Errno 2] No such file or directory: 'Heisei_Whoa.xlsx'. Could you please provide all the necessary files to run the code?

Also, could you please remove the dependencies and get the example down to the simplest possible repro? That will allow us to easily debug the issue. Thanks!

tolandwehr commented 4 years ago

@amahendrakar

Heisei_Whoa.xlsx contains data that I'm not allowed to pass on, unfortunately ^^'. But it was checked with

.isnull().sum().sum()

to be free of NaNs.

You can condense the code down to:

import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = '0' # Set to -1 if CPU should be used CPU = -1 , GPU = 0

gpus = tf.config.experimental.list_physical_devices('GPU')
cpus = tf.config.experimental.list_physical_devices('CPU')

if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
elif cpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        logical_cpus= tf.config.experimental.list_logical_devices('CPU')
        print(len(cpus), "Physical CPU,", len(logical_cpus), "Logical CPU")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

#from __future__ import print_function, division

import plotly.express as px
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import keras
import sys
import os

import tsaug
from tsaug.visualization import plot
from tsaug import TimeWarp, Crop, Quantize, Drift, Reverse, Convolve, AddNoise, Dropout, Pool, Resize

import statsmodels
import datawig
import impyute

import missingpy
from missingpy import KNNImputer,MissForest

from impyute.imputation.cs import mice
from datawig import SimpleImputer
from statsmodels import robust
from operator import itemgetter,attrgetter
from functools import partial
from scipy import stats

from pylab import rcParams
from tpot import TPOTRegressor

from sklearn import preprocessing
from sklearn.decomposition import PCA
from sklearn import model_selection
from sklearn.preprocessing import StandardScaler, MinMaxScaler, Normalizer, LabelEncoder, RobustScaler, QuantileTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score, cross_val_predict, cross_validate, GridSearchCV, RandomizedSearchCV, TimeSeriesSplit, KFold
from sklearn.metrics import mean_squared_error, r2_score, explained_variance_score, make_scorer,median_absolute_error, mean_absolute_error,max_error,explained_variance_score
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_array, check_is_fitted
from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer, SimpleImputer

from tensorflow.keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
from kerastuner import HyperModel

from tensorflow.python.keras.layers import InputLayer, TimeDistributed, Lambda, Dense, Dot, Reshape,Concatenate, Embedding, Activation, Conv1D, Conv2D, Cropping2D, MaxPooling2D, Flatten, Dropout, LSTM, GRU, Bidirectional, Input, LeakyReLU,Conv2DTranspose, ZeroPadding2D, ZeroPadding1D, UpSampling2D, UpSampling1D,multiply,AveragePooling1D # components of network
from tensorflow.python.keras.models import Model, Sequential # type of model
from tensorflow.python.keras.layers import BatchNormalization
from tensorflow.python.keras.optimizers import Adam, RMSprop, SGD, Nadam, Adadelta, Adamax
from tensorflow.python.keras.regularizers import l2
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard

from ctgan import CTGANSynthesizer

#import tensorflow.keras.backend as K
from tensorflow.python.keras.backend import expand_dims, squeeze

from tqdm import tqdm

X_full = np.random.rand(1200, 18, 15)
y_full = np.random.rand(1200, 18)

# the condensed snippet still needs these from the full notebook:
n_steps = 18     # matches X_full.shape[1]
Set_Width = 16   # so Set_Width - 1 == 15 features, matching X_full.shape[2]

y_test=np.array(pd.DataFrame(y_full).tail(round(0.10001671123*y_full.shape[0])))
y_train=np.array(pd.DataFrame(y_full).head(round((1-0.10001671123)*y_full.shape[0])))

X_test=np.array(pd.DataFrame(X_full.reshape(X_full.shape[0],X_full.shape[1]*X_full.shape[2])).tail(round(0.10001671123*X_full.shape[0]))).reshape(round(0.10001671123*X_full.shape[0]),X_full.shape[1],X_full.shape[2])
X_train=np.array(pd.DataFrame(X_full.reshape(X_full.shape[0],X_full.shape[1]*X_full.shape[2])).head(round(1-0.10001671123*X_full.shape[0]-1))).reshape(round((1-0.10001671123)*X_full.shape[0]),X_full.shape[1],X_full.shape[2])

Train_Len=len(X_train)
Test_Len=len(X_test)

print(y_train.shape,y_test.shape,X_train.shape,X_test.shape)

X_test_scaler = preprocessing.StandardScaler()
y_test_scaler = preprocessing.StandardScaler()
X_train_scaler = preprocessing.StandardScaler()
y_train_scaler = preprocessing.StandardScaler()

y_test=y_test_scaler.fit_transform(y_test)
y_train=y_train_scaler.fit_transform(y_train)

X_test=X_test_scaler.fit_transform(X_test.reshape(Test_Len*n_steps,Set_Width-1))
X_train=X_train_scaler.fit_transform(X_train.reshape(Train_Len*n_steps,Set_Width-1))

X_test=X_test.reshape(Test_Len,n_steps,Set_Width-1)
X_train=X_train.reshape(Train_Len,n_steps,Set_Width-1)

#########Model and SKlearn Cross_Val_Predict

def twds_model(layer1=32, layer2=32, layer3=16, dropout_rate=0.5, optimizer='Adam',
               learning_rate=0.001, activation='relu', loss='mse'):

    model = Sequential()
    model.add(Bidirectional(GRU(layer1, return_sequences=True),input_shape=(X_train.shape[1],X_train.shape[2])))
    model.add(AveragePooling1D(2))
    model.add(Conv1D(layer2, 3, activation=activation, padding='same', 
               name='extractor'))
    model.add(Flatten())
    model.add(Dense(layer3,activation=activation))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1))
    model.compile(optimizer=optimizer,loss=loss)
    return model

twds_model=twds_model()
print(twds_model.summary())

def CustomVarious(y_true, y_pred):
    # note: Heisei_TR exists only in the full notebook; this scorer is
    # defined but not passed to cross_val_predict below
    y_true = y_true.reshape(len(y_true[:,1])*Heisei_TR.shape[1],)

    if np.isnan(y_pred).any():
        MAD = 1000000  # heavy penalty for NaN predictions
    else:
        y_pred = y_pred.reshape(len(y_pred[:,1])*Heisei_TR.shape[1],)
        MAD = median_absolute_error(y_true, y_pred)
        print(MAD)

    return MAD

scorer = make_scorer(CustomVarious, greater_is_better=False)

model_twds = KerasRegressor(build_fn=twds_model, batch_size=256, epochs=6)

############# PLACE OF THE ERROR ############
twds_Pred=cross_val_predict(model_twds, 
               X_train, 
               y_train, 
               n_jobs=1, 
               cv=4, 
               verbose=2)
tolandwehr commented 4 years ago

Or actually you could skip all the data preshaping and just feed in:

X_train = np.random.rand(1200, 18, 15)
y_train = np.random.rand(1200, 18, 1)
amahendrakar commented 4 years ago

@tolandwehr Thank you for the update. I was able to reproduce the issue with TF v2.3.

However, on running the code with TF-nightly, I am facing a different error stating ValueError: The first argument to Layer.call must always be passed. Please find the attached gist. Thanks!

amahendrakar commented 4 years ago
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-603-37b55dfd53fd> in <module>
----> 1 GridLSTM.fit(X_train, y_train)

~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

@tolandwehr, looking at the error log, it seems like the error is thrown from the sklearn\utils\validation.py module, and Tensorflow is just the name of your conda environment. Please correct me if I am wrong, thanks!

tolandwehr commented 4 years ago

@amahendrakar

Exactly, the environment is called 'Tensorflow' and the module is 'sklearn\utils\validation.py'.

amahendrakar commented 4 years ago

@gowthamkpr, Running the code with TF v2.3, throws an error stating TypeError: can't pickle _thread.RLock objects.

However, with TF-nightly, the error changes to ValueError: The first argument to Layer.call must always be passed. Please find the attached gist. Thanks!

rmothukuru commented 4 years ago

@tolandwehr, can you please refer to this article, which covers using the Keras wrapper in scikit-learn? Thanks!
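
For reference, a minimal sketch of the wrapper pattern that article describes, assuming TF 2.3's tf.keras scikit-learn wrapper (shapes are made up to mirror this issue); the key point is that build_fn receives the builder function itself, never an already-built model:

import numpy as np
from sklearn.model_selection import cross_val_predict
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

def build_model():
    # tiny stand-in for the twds_model architecture
    model = Sequential([
        Flatten(input_shape=(18, 15)),
        Dense(16, activation='relu'),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

X = np.random.rand(120, 18, 15)
y = np.random.rand(120)

reg = KerasRegressor(build_fn=build_model, batch_size=16, epochs=2, verbose=0)
pred = cross_val_predict(reg, X, y, cv=3)  # clone() copies the function, not a model
print(pred.shape)  # (120,)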

rmothukuru commented 3 years ago

@tolandwehr, can you please respond to the above comment? Thanks!

tolandwehr commented 3 years ago

@rmothukuru Sry, the last weeks are very busy on releasing some topics, I will come back to the issue next week.

rmothukuru commented 3 years ago

@tolandwehr, can you please respond to the above comment? Thanks!

rmothukuru commented 3 years ago

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

google-ml-butler[bot] commented 3 years ago

Are you satisfied with the resolution of your issue?

anoldmaninthesea commented 3 years ago

I've just opened a related issue, but now with freely available data: https://github.com/tensorflow/tensorflow/issues/47324