lux-org / lux

Automatically visualize your pandas dataframe via a single print! 📊 💡
Apache License 2.0

ValueError: Field "duration" has type "timedelta64[ns]" which is not supported by Altair. #419

Closed CiaranHaines closed 2 years ago

CiaranHaines commented 3 years ago

It worked fine with 2 columns of datetimes and 2 columns of integers. I used df.apply(f(x)) to create a timedelta column and got the following warning. Full text:

/opt/conda/lib/python3.7/site-packages/IPython/core/formatters.py:918: UserWarning: Unexpected error in rendering Lux widget and recommendations. Falling back to Pandas display. Please report the following issue on Github: https://github.com/lux-org/lux/issues

/opt/conda/lib/python3.7/site-packages/lux/core/frame.py:632: UserWarning:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/lux/core/frame.py", line 594, in _ipython_display_
    self.maintain_recs()
  File "/opt/conda/lib/python3.7/site-packages/lux/core/frame.py", line 451, in maintain_recs
    self._widget = rec_df.render_widget()
  File "/opt/conda/lib/python3.7/site-packages/lux/core/frame.py", line 681, in render_widget
    widgetJSON = self.to_JSON(self._rec_info, input_current_vis=input_current_vis)
  File "/opt/conda/lib/python3.7/site-packages/lux/core/frame.py", line 721, in to_JSON
    recCollection = LuxDataFrame.rec_to_JSON(rec_infolist)
  File "/opt/conda/lib/python3.7/site-packages/lux/core/frame.py", line 749, in rec_to_JSON
    chart = vis.to_code(language=lux.config.plotting_backend, prettyOutput=False)
  File "/opt/conda/lib/python3.7/site-packages/lux/vis/Vis.py", line 334, in to_code
    return self.to_vegalite(**kwargs)
  File "/opt/conda/lib/python3.7/site-packages/lux/vis/Vis.py", line 310, in to_vegalite
    self._code = renderer.create_vis(self)
  File "/opt/conda/lib/python3.7/site-packages/lux/vislib/altair/AltairRenderer.py", line 99, in create_vis
    chart_dict = chart.chart.to_dict()
  File "/opt/conda/lib/python3.7/site-packages/altair/vegalite/v4/api.py", line 373, in to_dict
    dct = super(TopLevelMixin, copy).to_dict(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 328, in to_dict
    context=context,
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 62, in _todict
    for k, v in obj.items()
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 63, in <dictcomp>
    if v is not Undefined
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 58, in _todict
    return [_todict(v, validate, context) for v in obj]
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 58, in <listcomp>
    return [_todict(v, validate, context) for v in obj]
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/schemapi.py", line 56, in _todict
    return obj.to_dict(validate=validate, context=context)
  File "/opt/conda/lib/python3.7/site-packages/altair/vegalite/v4/api.py", line 363, in to_dict
    copy.data = _prepare_data(original_data, context)
  File "/opt/conda/lib/python3.7/site-packages/altair/vegalite/v4/api.py", line 84, in _prepare_data
    data = _pipe(data, data_transformers.get())
  File "/opt/conda/lib/python3.7/site-packages/toolz/functoolz.py", line 627, in pipe
    data = func(data)
  File "/opt/conda/lib/python3.7/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/altair/vegalite/data.py", line 19, in default_data_transformer
    return curried.pipe(data, limit_rows(max_rows=max_rows), to_values)
  File "/opt/conda/lib/python3.7/site-packages/toolz/functoolz.py", line 627, in pipe
    data = func(data)
  File "/opt/conda/lib/python3.7/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/data.py", line 149, in to_values
    data = sanitize_dataframe(data)
  File "/opt/conda/lib/python3.7/site-packages/altair/utils/core.py", line 317, in sanitize_dataframe
    "".format(col_name=col_name, dtype=dtype)
ValueError: Field "duration" has type "timedelta64[ns]" which is not supported by Altair. Please convert to either a timestamp or a numerical value.

CiaranHaines commented 3 years ago

Forgot the context: I was writing this in a notebook on Kaggle.com. See Notebook here, with the issue occurring in the penultimate code block, when toy_times is displayed after the assignment to toy_times['duration'].
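For readers without access to the notebook, here is a minimal sketch of how a column like this can end up with a timedelta dtype. It does not use Lux or Altair at all; the sample timestamps and the `elapsed` helper are made up for illustration and are not the notebook's `sum_time` function.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's data: two datetime columns.
toy_times = pd.DataFrame({
    "start": pd.to_datetime(["2021-09-15 10:00:00", "2021-09-15 11:30:00"]),
    "end": pd.to_datetime(["2021-09-15 10:45:00", "2021-09-16 12:00:00"]),
})

def elapsed(row):
    # Subtracting two timestamps yields a pandas Timedelta.
    return row["end"] - row["start"]

# A row-wise apply that returns Timedelta values produces a timedelta column.
toy_times["duration"] = toy_times.apply(elapsed, axis=1)

# List the columns whose dtype is a timedelta, i.e. the ones the error complains about.
timedelta_columns = [
    col for col in toy_times.columns
    if pd.api.types.is_timedelta64_dtype(toy_times[col])
]
print(toy_times.dtypes)
print(timedelta_columns)
```

Running a check like this before displaying the dataframe shows which columns would need converting.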

dorisjlee commented 3 years ago

Hi @CiaranHaines,

Thank you for reporting this problem and sending the notebook to help us reproduce the error! It looks like Altair (i.e., the plotting backend that Lux uses) does not support plotting of the time duration data type (timedelta64[ns]). There are two quick ways we can work around the problem for now:

1) You can switch the plotting backend to matplotlib by adding the following code before you create the toy_times["duration"] column:

lux.config.plotting_backend="matplotlib"

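[image: Screen Shot 2021-09-15 at 10 05 12 AM] https://user-images.githubusercontent.com/5554675/133477891-c9a8398e-3fc4-4e60-909e-178530dc6609.png

You can see that each time duration here is plotted as its own bar, which is not very understandable.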

2) You can convert the duration column into seconds (as an integer). This way, Lux should be able to visualize the numbers for you by default. We can do this by adding .dt.seconds in your column definition to convert the timedelta resulting from the sum to number of seconds.

toy_times['duration'] = toy_times.apply(lambda x: sum_time(x['start'], x['end']), axis=1).dt.seconds
toy_times

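[image: Screen Shot 2021-09-15 at 10 10 28 AM] https://user-images.githubusercontent.com/5554675/133478597-c8fceebb-d938-445f-ac27-603d05c0150a.png

You can see now that there is a distribution of duration times using the second approach. Let us know if this resolves the issue for you!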
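One caveat on the second approach, for anyone whose durations can exceed 24 hours: in pandas, `.dt.seconds` returns only the seconds component of each timedelta (the days are reported separately by `.dt.days`), while `.dt.total_seconds()` returns the whole duration as a float. The short sketch below uses made-up values to show the difference. If the durations in the notebook are all under a day, both give the same numbers.

```python
import pandas as pd

# Hypothetical durations: 45 minutes, and 1 day 30 minutes (1470 minutes).
durations = pd.Series(pd.to_timedelta([45, 1470], unit="min"))

# Seconds component only: the full day in the second value is dropped.
component_seconds = durations.dt.seconds.tolist()

# Whole duration expressed in seconds, as floats.
total_seconds = durations.dt.total_seconds().tolist()

print(component_seconds)  # [2700, 1800]
print(total_seconds)      # [2700.0, 88200.0]
```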

dorisjlee commented 3 years ago

Note for dev: Improve error message when Altair is unable to handle custom datetime types (e.g., timedelta, interval). Possibly look into making automatic conversions to display visualizations.
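As a rough illustration of what an automatic conversion could look like, here is a sketch of a helper that turns every timedelta column into float seconds before the dataframe is handed to a plotting backend. The name `timedeltas_to_seconds` is hypothetical and this is not part of Lux. It only covers timedelta columns, not interval types.

```python
import pandas as pd

def timedeltas_to_seconds(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with each timedelta column converted to float seconds."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_timedelta64_dtype(out[col]):
            # total_seconds() keeps durations longer than a day intact.
            out[col] = out[col].dt.total_seconds()
    return out

# Usage with made-up data: one label column and one timedelta column.
frame = pd.DataFrame({
    "task": ["a", "b"],
    "duration": pd.to_timedelta([45, 1470], unit="min"),
})
converted = timedeltas_to_seconds(frame)
print(converted.dtypes)
```

Working on a copy leaves the original timedelta column available for any later date arithmetic.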

CiaranHaines commented 2 years ago

Cheers xxxx
