capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets
https://capitalone.github.io/DataProfiler
Apache License 2.0

Cannot load DataLabeler due to error in labeler utils. #1126

Open JGSweets opened 5 months ago

JGSweets commented 5 months ago

General Information:

Describe the bug: On this line, https://github.com/capitalone/DataProfiler/blob/f8b3e5dbd4b76f0ecc291911ace9e8e21cf1ecb1/dataprofiler/labelers/labeler_utils.py#L360, I receive the error: TypeError: Metric.add_weight() got multiple values for argument 'shape'

Possibly related to: https://stackoverflow.com/questions/62976818/add-weight-got-multiple-values-for-argument-name-while-using-a-custom-attent

Current TF Version: tensorflow==2.16.1

To Reproduce:

import dataprofiler as dp

labeler = dp.DataLabeler.load_from_library("unstructured")

Expected behavior: Loads the labeler

Additional context: I think this is caused by a change in a newer TF release that DP has not yet been updated for. Requiring tensorflow==2.15.1 is a current workaround.
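A minimal sketch of a version guard for the workaround above. The helper names `parse_version` and `is_affected` are hypothetical and not part of DataProfiler. The assumption that every release from 2.16 onward is affected is also mine: the thread only confirms that 2.16.1 fails and 2.15.1 works.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.16.1' into (2, 16, 1), ignoring suffixes such as 'rc0'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def is_affected(tf_version: str) -> bool:
    """Hypothetical check: True if this TensorFlow version likely hits the bug.

    Assumption: all releases >= 2.16 are affected. Only 2.16.1 (fails) and
    2.15.1 (works) are reported in this issue.
    """
    return parse_version(tf_version) >= (2, 16)


print(is_affected("2.16.1"))
print(is_affected("2.15.1"))
```

The installed version string could be obtained with `importlib.metadata.version("tensorflow")` and passed to `is_affected` before calling `dp.DataLabeler.load_from_library`.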

JGSweets commented 5 months ago

Not 100% positive, but this might be resolved just by setting name=name. It looks like keras 3.0.0 changes the metric format for __init__: https://github.com/keras-team/keras/blob/v3.1.1/keras/metrics/metric.py#L9
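To illustrate why passing name=name could help, here is a sketch that needs no Keras install. The two functions are simplified stand-ins modeled on my understanding of the Keras 2 and Keras 3 `Metric.add_weight` parameter order, not the real implementations, and the weight name `"true_positives"` is only an example.

```python
def add_weight_keras2_style(name, shape=(), initializer=None, dtype=None):
    # Stand-in for the older signature: `name` is the first positional parameter.
    return {"name": name, "shape": shape}


def add_weight_keras3_style(shape=(), initializer=None, dtype=None, name=None):
    # Stand-in for the newer signature: `shape` is first, `name` is keyword-style.
    return {"name": name, "shape": shape}


# The old call style works against the old parameter order.
add_weight_keras2_style("true_positives", shape=(3,))

# Against the new order, the positional string binds to `shape`, which is
# then also passed by keyword, so Python raises a TypeError.
try:
    add_weight_keras3_style("true_positives", shape=(3,))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Passing the name by keyword avoids the collision under both orders.
fixed = add_weight_keras3_style(name="true_positives", shape=(3,))
print(fixed)
```

Using keyword arguments for every parameter keeps the call valid regardless of which parameter comes first.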

Also, might need to upgrade the keras version.

taylorfturner commented 5 months ago

Yep, we noticed this last week. Re "Requiring tensorflow==2.15.1 is a current workaround.": yep, that temporary fix is the current recommended workaround.

Thanks for documenting in an issue @JGSweets 👍

JGSweets commented 5 months ago

Looks like that might resolve that issue. The new issue is that the models all need to be updated to use Version 3 of keras:

ValueError: File format not supported: filepath=.... Keras 3 only supports V3 .keras files and legacy H5 format files (.h5 extension). Note that the legacy SavedModel format is not supported by load_model() in Keras 3.
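A rough sketch of a pre-check based only on the wording of the error message above. The helper `is_keras3_loadable` and the suffix set are hypothetical. The real `load_model()` logic may accept or reject other cases, so treat this as a quick filter for spotting saved models that still need conversion.

```python
from pathlib import Path

# Suffixes named in the error message: V3 `.keras` files and legacy `.h5` files.
KERAS3_LOADABLE_SUFFIXES = {".keras", ".h5"}


def is_keras3_loadable(filepath: str) -> bool:
    """Hypothetical check: does the path end in a suffix the message lists?"""
    return Path(filepath).suffix.lower() in KERAS3_LOADABLE_SUFFIXES


print(is_keras3_loadable("model.keras"))
print(is_keras3_loadable("saved_model_dir"))
```

A legacy SavedModel is a directory without one of these suffixes, so it fails this check, consistent with the message's note that the SavedModel format is not supported.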

JGSweets commented 5 months ago

And TF's release notes: https://github.com/tensorflow/tensorflow/releases/tag/v2.16.1

taylorfturner commented 5 months ago

Yeah, I think you are right -- after I saw the errors on the PR checks, it looks like the model itself would need a version update.

slonweiss commented 4 months ago

Just a note that the current workaround does not work on Windows, as TensorFlow version 2.15.1 is not available there.

JGSweets commented 4 months ago

Any update on this? Thanks!

taylorfturner commented 4 months ago

Re "Any update on this? Thanks!": Not yet, @JGSweets. Thanks for the bump though -- haven't forgotten about it.