Closed alexv1247 closed 5 years ago
Can you post the error messages here? Without them, I cannot tell for sure.
Does your model work without modAL? Can you train it with your data? Because I don't think the 3D shape is a problem for modAL, since the data interacts with the model only. (I have tried other 3D shapes for image classification problems, they work fine.)
This is the error message for uncertainty_batch_sampling:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c64fd087a5a2>", line 1, in <module>
    runfile('C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_modAL.py', wdir='C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types')
  File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_modAL.py", line 44, in <module>
    query_idx, query_instance = learner.query(x_pool, n_instances=20)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\models\base.py", line 194, in query
    query_result = self.query_strategy(self, *query_args, **query_kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\batch.py", line 197, in uncertainty_batch_sampling
    n_instances=n_instances, metric=metric, n_jobs=n_jobs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\batch.py", line 150, in ranked_batch
    metric=metric, n_jobs=n_jobs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\batch.py", line 82, in select_instance
    n_labeled_records, _ = X_training.shape
ValueError: too many values to unpack (expected 2)
This is the error for expected_error_reduction:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c64fd087a5a2>", line 1, in <module>
    runfile('C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_modAL.py', wdir='C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types')
  File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_modAL.py", line 44, in <module>
    query_idx, query_instance = learner.query(x_pool, n_instances=20)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\models\base.py", line 194, in query
    query_result = self.query_strategy(self, *query_args, **query_kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\expected_error.py", line 64, in expected_error_reduction
    X_new = data_vstack((learner.X_training, x.reshape(1, -1)))
  File "C:\ProgramData\Anaconda3\lib\site-packages\modAL\utils\data.py", line 22, in data_vstack
    return np.concatenate(blocks)
ValueError: all the input arrays must have same number of dimensions
Since this is the same code and the same data I used with the default query strategy, I don't know how to tackle this error.
What are the type and shape of your training data, especially x_initial_training and x_pool? The problem seems to be with those.
For batch sampling, it seems that x_initial_training is actually a 1D array. With expected error reduction, the problem could be the same if the shapes of these arrays differ. Can you check these?
x_pool is a numpy array with shape (31982, 10, 6) and type float. x_initial_training is a numpy array with shape (636, 10, 6) and type float.
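For readers following along, both errors can be reproduced with NumPy alone, without modAL or Keras. The sketch below copies only the two statements quoted in the tracebacks (batch.py line 82 and expected_error.py line 64); the small array sizes are stand-ins chosen for speed, and the variable names `unpack_error` and `stack_error` are made up for illustration.

```python
import numpy as np

# Small stand-ins with the same number of dimensions as the reported arrays
# (the real shapes were (636, 10, 6) and (31982, 10, 6)).
X_training = np.zeros((4, 10, 6))
x_pool = np.zeros((8, 10, 6))

# Pattern from the first traceback: unpacking assumes exactly two dimensions.
try:
    n_labeled_records, _ = X_training.shape
    unpack_error = None
except ValueError as err:
    unpack_error = str(err)

# Pattern from the second traceback: the candidate is flattened to 2D
# and then stacked onto a 3D training array.
x = x_pool[0]
try:
    np.concatenate((X_training, x.reshape(1, -1)))
    stack_error = None
except ValueError as err:
    stack_error = str(err)

print(unpack_error)
print(stack_error)
```

Both statements implicitly assume 2D input of the form (n_samples, n_features), so any array with three or more dimensions triggers them regardless of the model being used.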
I'll try to figure out what went wrong soon. Not sure I can look into this during the weekend, but I'll fix this by the end of next week!
Quick update: the bug is definitely in modAL. I am preparing a fix, and it will be ready soon!
The fix is in! Now these query strategies work with multidimensional data. You can update your local installation by installing directly from the master branch:
pip install git+https://github.com/modAL-python/modAL.git
Let me know if there is a problem!
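To illustrate what shape-agnostic handling can look like, here is a minimal sketch. It is not the actual patch that went into modAL; the helper names `count_records` and `append_instance` are hypothetical, and the idea is simply to read the record count from the first axis and to add a leading axis to a single instance instead of flattening it.

```python
import numpy as np

def count_records(X):
    """Number of records along the first axis, for any number of dimensions."""
    return X.shape[0]

def append_instance(X_training, x):
    """Stack a single instance x onto X_training while keeping x's own shape."""
    # x[np.newaxis] adds a leading axis, e.g. (10, 6) -> (1, 10, 6)
    return np.concatenate((X_training, x[np.newaxis]), axis=0)

X_training = np.zeros((4, 10, 6))
x = np.ones((10, 6))
X_new = append_instance(X_training, x)

print(count_records(X_training), X_new.shape)  # 4 (5, 10, 6)
```

The same two helpers work unchanged for 2D data, since `x[np.newaxis]` turns a 1D feature vector of shape (6,) into (1, 6).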
A small note: expected error reduction will only work with scikit-learn models, since it requires cloning and retraining the classifier, which might not work with Keras.
I am using Keras/TensorFlow models with this framework and the ActiveLearner class. As soon as I try to change the query strategy, various errors occur.
What do I have to change to use the different strategies? The training input has a 3D shape. So far I have tried all the uncertainty methods, of which only the default selection worked. Now I am trying the expected_error_reduction strategy, but errors occur there as well.
I am afraid the 3D shape of the training data is breaking all the other algorithms, but an LSTM requires this kind of shape.
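One general way to get 3D sequence data past code that expects 2D input is to flatten each sample to a single row and restore the (samples, timesteps, features) layout just before the LSTM sees it. This is only a sketch of that round trip with NumPy, not something proposed in this thread; the pool size of 3 is arbitrary, and it assumes the model side can undo the flattening (for example with a reshape step in front of the LSTM).

```python
import numpy as np

timesteps, features = 10, 6

# A tiny pool with the same per-sample layout as the data described here.
x_pool = np.arange(3 * timesteps * features, dtype=float).reshape(3, timesteps, features)

# Flatten each sample to one row for code that expects (n_samples, n_features).
x_pool_2d = x_pool.reshape(len(x_pool), -1)

# Restore the (samples, timesteps, features) layout before calling the LSTM.
x_pool_3d = x_pool_2d.reshape(-1, timesteps, features)

print(x_pool_2d.shape, x_pool_3d.shape)
```

Because `reshape` keeps the element order, the round trip returns exactly the original array, so indices selected on the flattened pool refer to the same samples in the 3D pool.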