Binary Classification yields Runtime Error

I am struggling with a relatively simple binary classification problem, similar to #613. I'm trying to use the MlpModule of Autokeras to generate multi-layer perceptron architectures. I apply data transformations to guarantee the correct format, and I have tried two setups: one output node with binary target labels, and two output nodes with one-hot-encoded target labels. Both yield the same or very similar results: an error message saying that the input to the loss is not in the expected range.

This is my code:

    from keras.utils import to_categorical

    from autokeras import MlpModule
    from autokeras.nn.metric import Accuracy
    from autokeras.backend.torch import DataTransformerMlp
    from torch.nn.modules.loss import BCELoss

    def binary_cross_entropy(prediction, target):
        return BCELoss()(prediction, target.float())

    # X, test_X, y, test_y are pandas objects, which is why '.values' is used below
    y = to_categorical(y)
    test_y = to_categorical(test_y)

    mlpModule = MlpModule(loss=binary_cross_entropy, metric=Accuracy, searcher_args={}, verbose=True)
    data_transformer = DataTransformerMlp(X.values)
    train_data = data_transformer.transform_train(X.values, y)
    test_data = data_transformer.transform_test(test_X.values, test_y)
    fit_args = {
        "n_output_node": 2,
        "input_shape": X.values.shape,
        "train_data": train_data,
        "test_data": test_data
    }
    mlpModule.fit(n_output_node=fit_args.get("n_output_node"),
                  input_shape=fit_args.get("input_shape"),
                  train_data=fit_args.get("train_data"),
                  test_data=fit_args.get("test_data"),
                  time_limit=1 * 60 * 60)

    # ...

Upon execution, I get the following error message:

RuntimeError: Assertion `x >= 0. && x <= 1.' failed. input value should be between 0~1, but got -0.268241 at ..\aten\src\THNN/generic/BCECriterion.c:62
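
My current reading of this assertion, as a minimal sketch: binary cross-entropy takes a logarithm of the prediction, so it only accepts values in [0, 1]. I am assuming here that the generated network returns raw, unbounded scores rather than probabilities, which would explain the value -0.268241. The helper names `sigmoid` and `bce` below are hypothetical and written in plain Python just to illustrate the math. They are not part of Autokeras or PyTorch.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, t):
    """Binary cross-entropy for one prediction p and one target t in {0, 1}."""
    if not 0.0 <= p <= 1.0:
        # Same kind of range check as the one that fails in the error above.
        raise ValueError(f"input value should be between 0~1, but got {p}")
    eps = 1e-12  # keeps log() finite when p is exactly 0 or 1
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))

raw_output = -0.268241  # the value reported in the error message

# Feeding the raw score directly is rejected, as in my run.
try:
    bce(raw_output, 1.0)
    raw_failed = False
except ValueError:
    raw_failed = True

# Squashing the score through a sigmoid first gives a valid input.
prob = sigmoid(raw_output)
loss = bce(prob, 1.0)
print(raw_failed, prob, loss)
```

If that assumption holds, PyTorch's `BCEWithLogitsLoss` (which applies the sigmoid internally) might be a drop-in replacement for `BCELoss` in my `binary_cross_entropy` function, but I have not verified this with MlpModule.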

For completeness, here is the full log:

C:\ProgramData\Miniconda\envs\nnmp\python.exe C:\Users\{user}\.IntelliJIdea2018.2\config\plugins\python\helpers\pydev\pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 50878 --file "C:/Users/{user}/Documents/Academics/Uni/Georg-August-University/Computer Science/Masterarbeit/code/src/test.py"
pydev debugger: process 8364 is connecting

Connected to pydev debugger (build 182.4505.22)
Using TensorFlow backend.
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
Saving Directory: C:\Users\{user}\AppData\Local\Temp\autokeras_5OBEEF
C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\backend\torch\data_transformer.py:175: RuntimeWarning: invalid value encountered in true_divide
  data = (data - self.mean) / self.std
C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\backend\torch\data_transformer.py:175: RuntimeWarning: divide by zero encountered in true_divide
  data = (data - self.mean) / self.std

Initializing search.
Initialization finished.

+----------------------------------------------+
|               Training model 0               |
+----------------------------------------------+
Backend Qt5Agg is interactive backend. Turning interactive mode on.
Epoch-1, Current Metric - 0:   0%|                                       | 0/14 [00:00<?, ? batch/s]Traceback (most recent call last):
  File "C:\Users\{user}\.IntelliJIdea2018.2\config\plugins\python\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Users\{user}\.IntelliJIdea2018.2\config\plugins\python\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Users\{user}\.IntelliJIdea2018.2\config\plugins\python\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Users\{user}\.IntelliJIdea2018.2\config\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/{user}/Documents/Academics/Uni/Georg-August-University/Computer Science/Masterarbeit/code/src/test.py", line 33, in <module>
    time_limit=1 * 60 * 60)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\net_module.py", line 69, in fit
    self.searcher.search(train_data, test_data, int(time_remain))
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\search.py", line 162, in search
    self.sp_search(graph, other_info, model_id, train_data, test_data)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\search.py", line 196, in sp_search
    self.metric, self.loss, self.verbose, self.path)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\search.py", line 363, in train
    raise e
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\search.py", line 356, in train
    verbose=verbose).train_model(**trainer_args)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\backend\torch\model_trainer.py", line 109, in train_model
    self._train()
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\autokeras\backend\torch\model_trainer.py", line 146, in _train
    loss = self.loss_function(outputs, targets)
  File "C:/Users/{user}/Documents/Academics/Uni/Georg-August-University/Computer Science/Masterarbeit/code/src/test.py", line 11, in binary_cross_entropy
    return BCELoss()(prediction, target.float())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\torch\nn\modules\loss.py", line 512, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\torch\nn\functional.py", line 2113, in binary_cross_entropy
    input, target, weight, reduction_enum)
RuntimeError: Assertion `x >= 0. && x <= 1.' failed. input value should be between 0~1, but got -0.268241 at ..\aten\src\THNN/generic/BCECriterion.c:62
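
A possible side issue visible in the log: the two RuntimeWarnings at `data = (data - self.mean) / self.std` suggest that at least one standard deviation is zero. The sketch below assumes the transformer standardizes each column separately (I have not confirmed this in the Autokeras source) and shows one way to guard against constant columns. `standardize_columns` is a hypothetical helper in plain Python, not an Autokeras function.

```python
import math
import statistics

def standardize_columns(rows):
    """Z-score each column; constant columns (std == 0) are set to 0.0."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)
        if std == 0.0:
            # Dividing by zero here would produce NaN or inf values.
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(v - mean) / std for v in col])
    # Transpose back to row-major order.
    return [list(row) for row in zip(*scaled_columns)]

# Toy data: the second column is constant, so its std is zero.
rows = [[0.0, 3.0], [22.0, 3.0], [0.0, 3.0]]
scaled = standardize_columns(rows)
print(scaled)
```

Dropping such columns before calling `DataTransformerMlp` would be another option. Either way, NaN inputs would be a separate problem from the range assertion itself.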

My labels look as follows:

'count    1692.000000
mean        0.500000
std         0.500148
min         0.000000
25%         0.000000
50%         0.500000
75%         1.000000
max         1.000000
Name: y, dtype: float64'

My feature data (X) consists of over 700 columns, each structured similarly to the following examples:


'count    1692.000000
mean        0.037825
std         0.761111
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max        22.000000
Name: [feature1], dtype: float64'

'count    1692.000000
mean        0.037234
std         0.746628
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max        22.000000
Name: [feature2], dtype: float64'

'count    1692.000000
mean        0.000000
std         0.270794
min        -5.000000
25%         0.000000
50%         0.000000
75%         0.000000
max         7.000000
Name: [feature3], dtype: float64'

Binary Classification yields Runtime Error #678