henrysky / astroNN

Deep Learning for Astronomers with Keras
http://astronn.readthedocs.io/
MIT License
193 stars 51 forks

Have a bug when reproduce "demo_tutorial/galaxy10/Galaxy10_Tutorial.ipynb" #27

Open Junjie-Jin opened 5 months ago

Junjie-Jin commented 5 months ago

System information

Describe the problem

I get an error when training the neural net. The error is:

Source code / logs



TypeError                                 Traceback (most recent call last)
Cell In[12], line 3
      1 # To train the nerual net
      2 # astroNN will normalize the data by default
----> 3 galaxy10net.train(train_images, train_labels)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/shared/warnings.py:55, in deprecated_copy_signature.<locals>.deco.<locals>.tgt(*args, **kwargs)
     49 warnings.warn(
     50     f"Call to function {target.__name__}() is deprecated and will be removed in "
     51     + f"future. Use {signature_source.__name__}() instead.",
     52     stacklevel=2,
     53 )
     54 inspect.signature(signature_source).bind(*args, **kwargs)
---> 55 return target(*args, **kwargs)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:702, in CNNBase.train(self, *args, **kwargs)
    700 @deprecated_copy_signature(fit)
    701 def train(self, *args, **kwargs):
--> 702     return self.fit(*args, **kwargs)

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:394, in CNNBase.fit(self, input_data, labels, sample_weight)
    380 """
    381 Train a Convolutional neural network
    382
   (...)
    391 :History: 2017-Dec-06 - Written - Henry Leung (University of Toronto)
    392 """
    393 # Call the checklist to create astroNN folder and save parameters
--> 394 self.pre_training_checklist_child(input_data, labels, sample_weight)
    396 reduce_lr = ReduceLROnPlateau(
    397     monitor="val_loss",
    398     factor=0.5,
   (...)
    403     verbose=self.verbose,
    404 )
    406 early_stopping = EarlyStopping(
    407     monitor="val_loss",
    408     min_delta=self.early_stopping_min_delta,
   (...)
    411     mode="min",
    412 )

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:319, in CNNBase.pre_training_checklist_child(self, input_data, labels, sample_weight)
    315 norm_labels = self.labels_normalizer.normalize(labels, calc=False)
    316 if (
    317     self.keras_model is None
    318 ):  # only compile if there is no keras_model, e.g. fine-tuning does not required
--> 319     self.compile()
    321 norm_data = self._tensor_dict_sanitize(norm_data, self.keras_model.input_names)
    322 norm_labels = self._tensor_dict_sanitize(
    323     norm_labels, self.keras_model.output_names
    324 )

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/astroNN/models/base_cnn.py:235, in CNNBase.compile(self, optimizer, loss, metrics, weighted_metrics, loss_weights, sample_weight_mode)
    229     raise RuntimeError(
    230         'Only "regression", "classification" and "binary_classification" are supported'
    231     )
    233 self.keras_model = self.model()
--> 235 self.keras_model.compile(
    236     loss=loss_func,
    237     optimizer=self.optimizer,
    238     metrics=self.metrics,
    239     weighted_metrics=weighted_metrics,
    240     loss_weights=loss_weights,
    241     sample_weight_mode=sample_weight_mode,
    242 )
    244 # inject custom training step if needed
    245 try:

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119     filtered_tb = _process_traceback_frames(e.__traceback__)
    120     # To get the full stack trace, call:
    121     # keras.config.disable_traceback_filtering()
--> 122     raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File ~/miniforge3/envs/py3.9/lib/python3.9/site-packages/keras/src/utils/tracking.py:26, in no_automatic_dependency_tracking.<locals>.wrapper(*args, **kwargs)
     23 @wraps(fn)
     24 def wrapper(*args, **kwargs):
     25     with DotNotTrackScope():
---> 26         return fn(*args, **kwargs)

TypeError: compile() got an unexpected keyword argument 'sample_weight_mode'
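The error above means that the installed `compile()` no longer accepts `sample_weight_mode`. One generic way to tolerate this kind of signature drift is to drop keyword arguments the callee does not declare. The sketch below is not astroNN's fix: `supported_kwargs`, `compile_old`, and `compile_new` are hypothetical names, the two stand-in functions only imitate an older and a newer signature, and the approach has not been checked against a real Keras model.

```python
import inspect


def supported_kwargs(func, **kwargs):
    """Return only the keyword arguments that `func` declares."""
    params = inspect.signature(func).parameters
    # A callee that takes **kwargs itself accepts everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-ins for an older and a newer compile() signature.
def compile_old(optimizer=None, loss=None, metrics=None, sample_weight_mode=None):
    return "compiled"


def compile_new(optimizer=None, loss=None, metrics=None):
    return "compiled"


requested = dict(optimizer="adam", loss="mse", sample_weight_mode=None)

print(sorted(supported_kwargs(compile_old, **requested)))
# ['loss', 'optimizer', 'sample_weight_mode']
print(sorted(supported_kwargs(compile_new, **requested)))
# ['loss', 'optimizer']

# The filtered call no longer raises TypeError on the newer signature.
print(compile_new(**supported_kwargs(compile_new, **requested)))
# compiled
```

Silently dropping an argument is only safe when its value is a no-op default, as `sample_weight_mode=None` is in this example; otherwise a warning or an explicit error is the better choice.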

Suggestion


Junjie-Jin commented 5 months ago

Maybe the command "pip list" will help; it can show which versions of packages are required.
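As a narrower alternative to a full "pip list", the versions of only the relevant packages can be read with the standard-library `importlib.metadata` module (Python 3.8+). This is a minimal sketch; `installed_versions` is a hypothetical helper, and the package list is an assumption based on the packages named in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from this thread; adjust as needed.
packages = ["astroNN", "tensorflow", "tensorflow-macos", "keras", "tensorflow-probability"]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Pasting this output into a bug report gives the maintainer the version combination without the noise of unrelated packages.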

Junjie-Jin commented 5 months ago

This is a software compatibility issue. On an M1 Mac, the problem can be solved with the following installation:

conda create -n tensorflow-gpu python=3.8
conda activate tensorflow-gpu
python -m pip install -U pip
python -m pip install tensorflow-macos==2.12.0
python -m pip install tensorflow-metal
pip install scikit-learn
pip install tensorflow-probability==0.19.0

To check whether the installation is OK:

import sys
import tensorflow.keras
import tensorflow as tf
import platform

print(f"Python Platform: {platform.platform()}")
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tensorflow.keras.__version__}")
print()
print(f"Python {sys.version}")
gpu = len(tf.config.list_physical_devices('GPU')) > 0
print("GPU is", "available" if gpu else "NOT AVAILABLE")

You should get:

Python Platform: macOS-12.3-arm64-i386-64bit
Tensor Flow Version: 2.12.0
Keras Version: 2.12.0

Python 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:49:57) [Clang 16.0.6 ]
GPU is available

henrysky commented 5 months ago

Thanks for the bug report!

Indeed, this is an ongoing issue with the latest version of TensorFlow (which is separating Keras out again) and Keras v3. If you want to quickly train a neural network to classify Galaxy10, here is a notebook that fine-tunes ResNet-V2 with Keras v3 on Galaxy10 images loaded with astroNN.

https://drive.google.com/file/d/1GnrsZAPZFTfBrhuQ09zqh4n8x1QYEPOb/view?usp=sharing

Please let me know if the notebook works for you locally. (It is unlikely that you can run it online with Google Colab, as you will get a resource-exhausted error due to the limited compute resources there.)

Junjie-Jin commented 5 months ago

I ran it locally, and it reports the error "ModuleNotFoundError: No module named 'keras.dtype_policies'"

henrysky commented 5 months ago

Are you using Keras v3? I think you need at least Keras v3 (and maybe at least TensorFlow v2.16) in order to run that notebook.
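A version requirement like this can be checked before running the notebook. The sketch below compares plain version strings using only the standard library; `version_tuple` and `meets_minimum` are hypothetical helpers, and the "3.0" minimum is an assumption taken from the comment above, not a documented requirement of the notebook.

```python
import re


def version_tuple(version):
    """Turn '3.3.3' or '2.16.0rc0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (zero-padded comparison)."""
    have, need = version_tuple(version), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(meets_minimum("3.3.3", "3.0"))   # True
print(meets_minimum("2.12.0", "3.0"))  # False
```

In a notebook, the installed version would come from `keras.__version__`, for example `meets_minimum(keras.__version__, "3.0")`.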

Junjie-Jin commented 5 months ago

Yes, the version of Keras is 3.3.3. I get this error:


ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 11
     10 try:
---> 11     from keras.src.dtype_policies.dtype_policy import set_dtype_policy
     12 except ImportError:

ModuleNotFoundError: No module named 'keras.src'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 13
     11     from keras.src.dtype_policies.dtype_policy import set_dtype_policy
     12 except ImportError:
---> 13     from keras.dtype_policies.dtype_policy import set_dtype_policy
     15 pylab_style(paper=True)
     16 set_dtype_policy("mixed_float16")

ModuleNotFoundError: No module named 'keras.dtype_policies'

henrysky commented 5 months ago

Aghh, I see: I should check for ModuleNotFoundError, not ImportError. I have fixed the notebook, or you can simply change

except ImportError

to

except ModuleNotFoundError
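A note for readers applying this change: in Python, `ModuleNotFoundError` is a subclass of `ImportError`, so `except ImportError` already catches it. The traceback earlier in the thread shows the fallback import on line 13 failing as well, which suggests that neither module path exists in that environment. The sketch below shows the subclass relationship and a fallback import that tries several paths in turn; `import_first_available` is a hypothetical helper, and the demonstration uses standard-library module names so that it runs without Keras.

```python
import importlib

# ModuleNotFoundError derives from ImportError, so either except clause catches it.
print(issubclass(ModuleNotFoundError, ImportError))  # True


def import_first_available(candidates):
    """Import and return the first module in `candidates` that exists, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    return None


# Demonstration with a missing module followed by a standard-library one.
module = import_first_available(["not_a_real_module_xyz", "json"])
print(module.__name__)  # json

# When every candidate is missing, the caller gets None instead of a traceback.
print(import_first_available(["not_a_real_module_xyz"]))  # None
```

For the notebook, the candidates would be the two paths from the traceback, `keras.src.dtype_policies.dtype_policy` and `keras.dtype_policies.dtype_policy`. A `None` result would then point to the installed Keras version rather than to the choice of exception class.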