tensorflow / datasets

TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
https://www.tensorflow.org/datasets
Apache License 2.0

Possible duplicates for movie_lens dataset #3373

Open ymodak opened 3 years ago

ymodak commented 3 years ago

Description of issue

The movie_lens dataset appears to be duplicated. On the TF website I see two versions of the MovieLens dataset:

[Screenshot, 2021-07-19, 12:38 PM]

Is there any difference between these two MovieLens datasets? If so, we may want to document the distinction or perhaps remove one of them. @Conchylicultor Can you please take a look? Any help is appreciated. Thanks!

jpgard commented 2 years ago

Is anyone able to provide clarification on this? In particular, what, if any, are the differences between these two datasets?

ymodak commented 2 years ago

@jpgard I see that the movie_lens dataset handle is deprecated and that movielens is recommended instead. This is now also raised as a warning when loading the dataset with the movie_lens handle:

import tensorflow_datasets as tfds

ds, ds_info = tfds.load('movie_lens', split='train', with_info=True)
# ==> WARNING:absl:The handle "movie_lens" for the MovieLens dataset is deprecated. Prefer using "movielens" instead.

jpgard commented 2 years ago

Thanks! Got it. So are the underlying datasets identical, and movie_lens is just a deprecated handle for the movielens data?
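
The thread does not answer this. One way to check locally is to compare the metadata that each handle resolves to. The sketch below is only a suggestion: the helper names `summarize`, `differing_fields`, and `compare_handles` are hypothetical, the choice of fields to compare is arbitrary, and it assumes both handles still resolve in the installed TFDS version. Matching metadata would be a hint that the data is the same, not proof.

```python
def summarize(handle):
    """Collect a few metadata fields for a TFDS handle (hypothetical helper)."""
    # Imported lazily so the pure comparison helper below works without TFDS.
    import tensorflow_datasets as tfds

    # Assumption: the handle is registered in the installed TFDS version.
    builder = tfds.builder(handle)
    return {
        "homepage": builder.info.homepage,
        "configs": sorted(c.name for c in builder.BUILDER_CONFIGS),
        "features": str(builder.info.features),
    }


def differing_fields(a, b):
    """Return the sorted keys whose values differ between two summaries."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


def compare_handles(old_handle, new_handle):
    """Return the metadata fields that differ between two TFDS handles."""
    return differing_fields(summarize(old_handle), summarize(new_handle))
```

Calling `compare_handles("movie_lens", "movielens")` returns an empty list if the compared fields match, or the names of the fields that differ.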