DeepLabCut / DeepLabCut-core

Headless DeepLabCut (no GUI support)
http://deeplabcut.org
GNU Lesser General Public License v3.0

WIP tf 2.2+ migration #6

Closed MMathisLab closed 3 years ago

MMathisLab commented 4 years ago

I started a branch called TF2.2alpha that is a start at migrating to TF2.

I followed the guide here: https://www.tensorflow.org/guide/migrate and am using pip install tf_slim; see also issue https://github.com/DeepLabCut/DeepLabCut/issues/601

Here is the log of the outstanding issues; some are resolved and some need more work. Zero rush, I just did this for a bit of fun.

short list:

Converted 187 files
Detected 3 issues that require attention
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
File: DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train 2.py
--------------------------------------------------------------------------------
DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train 2.py:205:12: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
--------------------------------------------------------------------------------
File: DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train.py
--------------------------------------------------------------------------------
DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train.py:207:12: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
--------------------------------------------------------------------------------
File: DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/train.py
--------------------------------------------------------------------------------
DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/train.py:207:12: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.

report.txt
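The short list above can be summarized mechanically. The following is a minimal sketch, assuming the report keeps the `path:line:col: LEVEL: message` layout shown above; the helper names `parse_report` and `skip_build_copies` are hypothetical, and the sample messages are abbreviated. Two of the three flagged paths sit under `build/lib`, which looks like a setuptools build copy of the package, so only one source file may actually need the manual check.

```python
import re

# One issue line looks like "some/path.py:205:12: WARNING: message".
# The path is matched lazily because it may contain spaces ("train 2.py").
ISSUE_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<level>WARNING|ERROR|INFO): (?P<message>.*)$"
)

# Abbreviated copy of the short list above (messages truncated).
SAMPLE = """\
File: DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train 2.py
DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train 2.py:205:12: WARNING: *.save requires manual check.
File: DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train.py
DeepLabCut-core/build/lib/deeplabcutcore/pose_estimation_tensorflow/train.py:207:12: WARNING: *.save requires manual check.
File: DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/train.py
DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/train.py:207:12: WARNING: *.save requires manual check.
"""


def parse_report(text):
    """Return one dict per issue line found in the report text."""
    issues = []
    for raw in text.splitlines():
        match = ISSUE_RE.match(raw.strip())
        if match:
            issues.append(
                {
                    "path": match["path"],
                    "line": int(match["line"]),
                    "col": int(match["col"]),
                    "level": match["level"],
                    "message": match["message"],
                }
            )
    return issues


def skip_build_copies(issues):
    """Drop issues whose path goes through a build/ directory."""
    return [issue for issue in issues if "/build/" not in issue["path"]]


for issue in skip_build_copies(parse_report(SAMPLE)):
    print(issue["path"], issue["line"], issue["level"])
```

To run it on the attached report, read the file contents and pass them to `parse_report` in place of `SAMPLE`.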

remaining issues (a list):

(1) testscript is failing due to no video to load into project

altear commented 3 years ago

Heya, what's the status on this? It seems like it's pretty much ready to go

I made a few changes and can run the testscript in tf2.2 (minus tensorpack, which hangs in my ubuntu docker env). Changes: https://github.com/rat-emotion/DeepLabCut-core/commit/f532ebbb7b9c8d2827ad79ba2a565b6d272d52b9

P.S. I'm interested in contributing but haven't touched open source before. If it's useful, I'm happy to PR those changes.

altear commented 3 years ago

The save_format='h5' option mentioned in the short list warnings seems to refer to Model.save and isn't part of tf.train.Saver so I removed it
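One way to check this kind of claim is to inspect the signatures directly. The sketch below is only an illustration: `accepts_save_format` and `check_tensorflow_savers` are hypothetical helper names, and the result for `tf.keras.Model.save` depends on the installed TensorFlow/Keras version, so no expected output is given.

```python
import inspect


def accepts_save_format(save_callable):
    """Return True if the callable declares a `save_format` parameter."""
    return "save_format" in inspect.signature(save_callable).parameters


def check_tensorflow_savers():
    """Report which of the two save methods declares `save_format`."""
    # Imported lazily so the helper above can be used without TensorFlow.
    import tensorflow as tf

    return {
        "tf.compat.v1.train.Saver.save": accepts_save_format(
            tf.compat.v1.train.Saver.save
        ),
        "tf.keras.Model.save": accepts_save_format(tf.keras.Model.save),
    }
```

Calling `check_tensorflow_savers()` in the migration environment shows whether the upgrade-script warning applies to the call that `train.py` actually makes.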

MMathisLab commented 3 years ago

Hey! I agree it’s very close! If you have edits that make it fly, please do make a PR :). You can comment on what you changed overall, and we will review it. Thanks for contributing! Much appreciated!!!

MMathisLab commented 3 years ago

A few issues remain after PR #8; see: https://github.com/DeepLabCut/DeepLabCut-core/pull/8#issuecomment-695217681

(1) warnings w/default loader, but it does train w/default and with imgaug on tensorflow 2.2:

WARNING:tensorflow:From /Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer_v1) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /Users/mwmathis/Documents/DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/nnet/losses.py:38: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
Loading ImageNet-pretrained resnet_50
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/Users/mwmathis/Documents/DeepLabCut-core/Testcore-Alex-2020-09-19/dlc-models/iteration-0/TestcoreSep19-trainset80shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'default', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TestcoreSep19/Testcore_Alex80shuffle1.mat', 'display_iters': 2, 'init_weights': '/Users/mwmathis/Documents/DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TestcoreSep19/Documentation_data-Testcore_80shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.001, 5]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/Users/mwmathis/Documents/DeepLabCut-core/Testcore-Alex-2020-09-19', 'save_iters': 5, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2}
Starting training....
iteration: 2 loss: 1.1634 lr: 0.001
iteration: 4 loss: 0.5873 lr: 0.001

... 

Starting with imgaug pose-dataset loader.
Batch Size is 1
Initializing ResNet
Loading ImageNet-pretrained resnet_50
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/Users/mwmathis/Documents/DeepLabCut-core/Testcore-Alex-2020-09-19/dlc-models/iteration-1/TestcoreSep19-trainset80shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'imgaug', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TestcoreSep19/Testcore_Alex80shuffle1.mat', 'display_iters': 1, 'init_weights': '/Users/mwmathis/Documents/DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TestcoreSep19/Documentation_data-Testcore_80shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.001, 5]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/Users/mwmathis/Documents/DeepLabCut-core/Testcore-Alex-2020-09-19', 'save_iters': 5, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2, 'num_outputs': 1, 'Task': None, 'scorer': None, 'date': None, 'video_sets': None, 'bodyparts': None, 'start': None, 'stop': None, 'numframes2pick': None, 'skeleton': [], 'skeleton_color': 'black', 'pcutoff': None, 'dotsize': None, 'alphavalue': None, 'colormap': None, 
'TrainingFraction': None, 'iteration': None, 'resnet': None, 'snapshotindex': None, 'cropping': None, 'x1': None, 'x2': None, 'y1': None, 'y2': None, 'corner2move2': None, 'move2corner': None}
Starting training....
iteration: 1 loss: 1.5909 lr: 0.001
iteration: 2 loss: 0.7031 lr: 0.001
iteration: 3 loss: 0.5709 lr: 0.001
iteration: 4 loss: 0.5077 lr: 0.001
iteration: 5 loss: 0.3996 lr: 0.001

...

ALL DONE!!! - default cases without Tensorpack loader are functional.
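Of the three warnings in the log above, only the `to_float` one points at DeepLabCut's own code (`losses.py`); the other two come from `tf_slim` and TensorFlow internals. The log's suggested fix is `tf.cast`. A minimal sketch of that replacement follows; the wrapper name `to_float32` is hypothetical, and the actual expression on that line of `losses.py` is not shown in this thread.

```python
def to_float32(x):
    """Replacement for the deprecated tf.to_float(x)."""
    # Imported lazily so the sketch only needs TensorFlow when it is actually run.
    import tensorflow as tf

    # tf.to_float cast its argument to float32; tf.cast does the same explicitly.
    return tf.cast(x, tf.float32)
```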

(2)

MMathisLab commented 3 years ago

Currently default train, evaluate cases are functional with TensorFlow 2.2 and 2.3 with only the following warnings:

MMathisLab commented 3 years ago

Outstanding issue that would need to be resolved before merging:


Initializing ResNet
WARNING:tensorflow:From /Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/tools/freeze_graph.py:127: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "testscript_cli.py", line 162, in <module>
    dlc.export_model(path_config_file,shuffle=1,make_tar=False)
  File "/Users/mwmathis/Documents/DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/export.py", line 327, in export_model
    tf_to_pb(sess, ckpt, output, output_dir=full_export_dir)
  File "/Users/mwmathis/Documents/DeepLabCut-core/deeplabcutcore/pose_estimation_tensorflow/export.py", line 214, in tf_to_pb
    initializer_nodes='')
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/tools/freeze_graph.py", line 361, in freeze_graph
    checkpoint_version=checkpoint_version)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/tools/freeze_graph.py", line 190, in freeze_graph_with_def_protos
    var_list=var_list, write_version=checkpoint_version)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 836, in __init__
    self.build()
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 848, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 886, in _build
    build_restore=build_restore)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 490, in _build_internal
    names_to_saveables)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 349, in validate_and_slice_inputs
    for converted_saveable_object in saveable_objects_for_op(op, name):
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 210, in saveable_objects_for_op
    variable, "", name)
  File "/Users/mwmathis/opt/anaconda3/envs/DLC-CPU/lib/python3.7/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 84, in __init__
    self.handle_op = var.op.inputs[0]
IndexError: tuple index out of range

altear commented 3 years ago

Awesome! The export_model issue seems to be related to freeze_graph, which a lot of people have been having trouble with in tf2.

Relevant: https://leimao.github.io/blog/Save-Load-Inference-From-TF2-Frozen-Graph/

MMathisLab commented 3 years ago

Aw, that's too bad! It's a rather core feature of DLC now ... I don't have more time today, but I'll look into it (or please feel free! :) Thanks again!)

AlexEMG commented 3 years ago

This seems to be a great resource: https://github.com/leimao/Frozen_Graph_TensorFlow/tree/master/TensorFlow_v2

@gkane26 do you want to have a look at the export?

gkane26 commented 3 years ago

I took a quick stab at the model export with tf 2.3 on my macbook -- it took some minimal changes -- the freeze_graph function was throwing an error that I didn't see a way around (see https://github.com/tensorflow/tensorflow/issues/24591). What did work was using the convert_variables_to_constants function (see the question here https://stackoverflow.com/questions/55299995/exporting-a-frozen-graph-pb-file-in-tensorflow-2). In addition to that change, there were some package imports that needed to be changed (tf_slim which seems to be required for mobilenets is now its own package), but otherwise things look good!
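The approach described above, swapping `freeze_graph` for `convert_variables_to_constants`, can be sketched roughly as follows. This is an illustration under stated assumptions, not the change that went into `export.py`: `freeze_to_pb` and `demo` are hypothetical names, the toy graph stands in for the DeepLabCut network, and the call is made through `tf.compat.v1.graph_util`, where TF 2.x exposes it as a deprecated API.

```python
import os
import tempfile


def freeze_to_pb(sess, output_node_names, pb_path):
    """Fold the session's variables into constants and write a binary .pb file."""
    # Imported lazily so the sketch only needs TensorFlow when it is actually run.
    import tensorflow as tf

    graph_def = sess.graph.as_graph_def()
    # Keeps only the nodes needed to compute the named outputs.
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, graph_def, list(output_node_names)
    )
    with open(pb_path, "wb") as f:
        f.write(frozen.SerializeToString())
    return frozen


def demo():
    """Freeze a toy one-layer graph (a stand-in, not the DeepLabCut network)."""
    import tensorflow as tf

    tf1 = tf.compat.v1
    graph = tf.Graph()
    with graph.as_default():
        x = tf1.placeholder(tf.float32, [None, 2], name="x")
        w = tf1.get_variable("w", shape=[2, 1])
        tf.identity(tf.matmul(x, w), name="y")
        with tf1.Session(graph=graph) as sess:
            sess.run(tf1.global_variables_initializer())
            pb_path = os.path.join(tempfile.mkdtemp(), "frozen_demo.pb")
            return freeze_to_pb(sess, ["y"], pb_path), pb_path
```

In a real export, the session would be the one restored from the training checkpoint, and the output node names would be those of the pose network rather than `"y"`.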

MMathisLab commented 3 years ago

@gkane26 - could you make the required changes here: https://github.com/DeepLabCut/DeepLabCut-core/pull/9

altear commented 3 years ago

Would there be any interest in a tf2 keras model + loader?

I had some trouble extending the current tf.compat.v1 version, so I made a tf2/keras implementation and wrote a function to load weights from pose-tensorflow/deeplabcut models

MMathisLab commented 3 years ago

Hi @altear, is there something in PR #9 that does not work for you currently? Happy to discuss another loader, though.