capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets
https://capitalone.github.io/DataProfiler
Apache License 2.0

Unhashable type: list when initializing DataLabeler #1140

Open js430 opened 3 months ago

js430 commented 3 months ago

General Information:

Describe the bug: When initializing a DataLabeler, I get the following error:

TypeError: unhashable type: 'list'
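For context, Python raises this message whenever a list is hashed, for example when it is used as a dictionary key or a set element. The sketch below only reproduces the message in isolation. It does not show where DataProfiler triggers it, and `hash_error_message` is a hypothetical helper, not part of the library.

```python
def hash_error_message(value):
    """Return the TypeError message raised when hashing `value`, or None if it hashes fine."""
    try:
        hash(value)
    except TypeError as exc:
        return str(exc)
    return None


# A list is mutable, so Python refuses to hash it.
print(hash_error_message(["00:1A:2B:3C:4D:5E"]))

# The equivalent tuple is immutable and hashes without error.
print(hash_error_message(("00:1A:2B:3C:4D:5E",)))
```

If the traceback points into a dependency rather than into DataProfiler itself, a version mismatch in that dependency is one possible cause worth checking.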

To Reproduce:

import dataprofiler as dp
from dataprofiler.data_readers.csv_data import CSVData

data=dp.Data("sample_data.csv")

data_labeler = dp.DataLabeler(labeler_type='structured')

Expected behavior:

I would expect no output, just the initialization of it and then afterwards, I could run something like:

predictions = data_labeler.predict(data)

Screenshots: Full error log:

[Screenshot: unhashable_type_list]

Additional context:

Attached is the sample data file I used, just a list of randomly generated MAC addresses, IP addresses, and IMSIs: sample_data.csv

taylorfturner commented 3 months ago

Hey @js430! Thanks for opening the issue.

I'm unable to replicate this on my end with 0.10.9.

import dataprofiler as dp

data=dp.Data("sample_data.csv")

data_labeler = dp.DataLabeler(labeler_type='structured')

predictions = data_labeler.predict(data)

I'm on macOS Sonoma with an M1 chip, myself.
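Since the error does not reproduce here, comparing environments may help narrow it down. Below is a minimal sketch for collecting that information using only the standard library. The package names in the default list are assumptions about what might be relevant, and `environment_report` is a hypothetical helper, not a DataProfiler function.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("DataProfiler", "tensorflow", "numpy")):
    """Collect Python, OS, and installed package versions for a bug report.

    The default package list is an assumption; adjust it to match the
    dependencies that appear in the traceback.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the output from the failing machine alongside the full traceback text would make the two setups directly comparable.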