PAIR-code / facets

Visualizations for machine learning datasets
https://pair-code.github.io/facets/
Apache License 2.0
7.35k stars, 888 forks

Beginner questions: any security risks and further info about facets? #102

Open heoa opened 6 years ago

heoa commented 6 years ago

Related to this, I cannot understand how a single HTML file in /Users/hhh/anaconda3/share/jupyter/nbextensions/facets-dist is sufficient for Google Facets to work.

  1. Are there any security risks in using the Facets Overview or Facets Pivot in Jupyter, or any of its other services?

An earlier Google product, OpenRefine (formerly Google Refine), also had facets and was used for refining messy data. Hence, I have a few questions to clarify the scope of Google Facets.

  1. Is the naming just coincidental, or does it imply anything about cleaning or refining data for ML projects?

  2. Is there anything beyond the Facets Overview (summary, exploratory reporting) and the Facets Pivot?

  3. Is there anything like OpenRefine (formerly Google Refine) for working with the data? Does Facets have anything to do with the facets in OpenRefine?

  4. In other words, does Facets provide any way to manipulate, cluster, rename, or refine the data like OpenRefine? Or is it just for reporting and exploratory analysis, designed especially for ML projects?

jameswex commented 6 years ago

The Facets HTML file installed to /Users/hhh/anaconda3/share/jupyter/nbextensions/facets-dist contains all of the HTML and JavaScript code of Facets plus all its dependencies, compiled into a single HTML file. That is why installing that one file in the right location for Jupyter to find is enough to get Facets working in a Jupyter notebook.
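As a rough illustration of why one file is enough, a notebook cell only has to emit a small HTML snippet that imports the bundled file and hands the data to the visualization element. The sketch below is an assumption-laden example, not the official integration code: the file name facets-jupyter.html and the facets-dive element follow the usage shown in the Facets README and may differ per install, and the helper build_facets_dive_html and the sample rows are hypothetical.

```python
import json

# Assumed location: the facets-dist directory comes from this thread; the file
# name facets-jupyter.html follows the Facets README and may differ per install.
FACETS_IMPORT = "/nbextensions/facets-dist/facets-jupyter.html"


def build_facets_dive_html(records, height=600):
    """Return an HTML snippet that loads the bundled Facets file and passes it records.

    Hypothetical helper for illustration; `records` is a list of dicts.
    """
    # Escape "</" so a string value such as "</script>" cannot end the script tag early.
    data_json = json.dumps(records).replace("</", "<\\/")
    return (
        f'<link rel="import" href="{FACETS_IMPORT}">\n'
        f'<facets-dive id="elem" height="{height}"></facets-dive>\n'
        "<script>\n"
        f"  var data = {data_json};\n"
        '  document.querySelector("#elem").data = data;\n'
        "</script>"
    )


# Hypothetical sample rows, for illustration only.
sample = [{"age": 31, "label": "a"}, {"age": 45, "label": "b"}]
html = build_facets_dive_html(sample)
```

In a notebook, the resulting string would typically be rendered with IPython's display(HTML(html)); everything the browser then runs comes from the locally installed file plus the inlined data.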

I'm not aware of any security risks of using Facets in Jupyter. The visualization is all done locally using the HTML and JavaScript installed as a Jupyter extension. What types of concerns would you have?

Facets is not related to OpenRefine. Facets was designed and built from scratch by my team at Google to help visualize and debug machine learning datasets. The name Facets refers to the ability to explore your datasets through different "facets" or "lenses", such as slicing by feature values in Facets Dive.

Facets doesn't provide any way to manipulate the data; it is just for visualizing the data.

I hope this helps clarify things a bit.