huggingface / dataset-viewer

Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
https://huggingface.co/docs/dataset-viewer
Apache License 2.0

Reduce the frequency of / allow to disable notifications for parquet conversions? #2349

Open severo opened 8 months ago

severo commented 8 months ago

Some users are annoyed by the discussions opened by the "parquet-converter" bot.

MNeMoNiCuZ commented 8 months ago

Yeah, I was gonna unsubscribe from the whole user because of these notifications. In my mind, automated systems shouldn't be at this verbosity level.

pixelass commented 8 months ago

As already noted on Hugging Face:

We have no idea what parquet-converter is. It seems to be something from apache.org, but we are not sure. They have no info on their page except a privacy policy link that leads to Google. To me this sounds very fishy, and we don't see any benefit from this user except for spamming all datasets. We expect a bit more transparency from AI companies or communities. There are too many bad people out there, and we have very high ethical and transparency standards.

So it's really about the user. If it is a legit account, then they should offer a bit of information about who they are and what their intentions are.

severo commented 8 months ago

OK. Another issue here is that the bot is not well-identified as an official Hugging Face Hub feature. Possibly the message + the avatar + the user description could be improved.

severo commented 8 months ago

First step: I expanded the title and descriptions to say it's an official Hugging Face bot.

[Screenshot, 2024-01-30 at 10:04:13]

pixelass commented 8 months ago

Please don't get me wrong. Thank you for looking into this and taking action. That being said:

https://parquet.apache.org/ [image]

I am talking about this link. Why does it link to Google's privacy policy? (Aside from the fact that it is not visually accessible, barely visible even to people with no visual impairment.)

I'm sorry, but adding Hugging Face BOT does not build any trust. Isn't that what every scammer does? (Hey hello we are ..., please give me your data).

I thought that Hugging Face values ethics and privacy. Whatever parquet-converter currently represents is definitely not what one would expect for those values. I stand by my statement. I do not trust a service that is not transparent about their use of data, ownership and intentions. If you want to play on the global market, please play by the rules and invest in GDPR, otherwise please stop spamming all datasets.

severo commented 8 months ago

re: https://parquet.apache.org/ it's an external website. We link to it because it's the reference for the Parquet format.

re title change: indeed, it's not a guarantee, but at least the bot claims its affiliation. I'll see if we can add an official badge to help build trust.

re disabling the notifications: I forgot to mention it, but you can set viewer: false in the README header to disable the viewer and the parquet conversion. See https://huggingface.co/docs/hub/datasets-viewer-configure#disable-the-viewer.
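For anyone who wants to script that change, here is a minimal sketch in Python. It assumes the README header is a YAML block delimited by `---` lines at the top of the file; the helper name `disable_viewer` is hypothetical and not part of any Hugging Face library. It works on plain text only and does not validate the YAML.

```python
def disable_viewer(readme_text: str) -> str:
    """Return README text whose YAML header contains `viewer: false`.

    Hypothetical helper: assumes the header, if present, is delimited
    by `---` lines at the very top of the file.
    """
    lines = readme_text.splitlines()
    if lines and lines[0].strip() == "---":
        # Look for the line that closes the header block.
        end = next(
            (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
            None,
        )
        if end is not None:
            # Drop any existing top-level `viewer:` entry, then add ours.
            header = [l for l in lines[1:end] if not l.startswith("viewer:")]
            header.append("viewer: false")
            return "\n".join(["---", *header, "---", *lines[end + 1:]]) + "\n"
    # No well-formed header found: prepend a new one.
    return "\n".join(["---", "viewer: false", "---", *lines]) + "\n"
```

Typical use would be to read the dataset's README.md, pass its contents through `disable_viewer`, and write the result back before pushing the change to the dataset repository.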

pixelass commented 8 months ago

Yeah, I mean do your thing. I don't much care.

Thanks for the tip on viewer: false. Very helpful

TimPietrusky commented 8 months ago

@severo I think it would also be nice if the bot were actually a member of the hf organisation itself.

severo commented 8 months ago

The ability to block the bot must be implemented in the Hub. See https://github.com/huggingface/moon-landing/issues/7807 (internal). Keeping this issue open for visibility.