Closed NevilleS closed 2 years ago
I think the main investigation here, and probably what will determine complexity, is what tools we need to use to get this to work... For instance, a static DAG of all of the categories would be doable in pure python (dask can render out images of DAGs), but the interactive nature of it hugely increases the complexity.
Maybe something like this would work? https://pythonhosted.org/dagger/ We could use the "stale" feature to highlight the specific item and get the downstream highlighting for free.
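To make the "highlight the item plus everything downstream" idea concrete, here's a rough sketch in plain Python. It does not use dagger's API at all; it's just a breadth-first walk over a parent/child map built from dotted keys. The key list and the function names (`build_children`, `downstream`) are made up for illustration, and it assumes a category's parent is simply its dotted prefix.

```python
from collections import deque

# Illustrative keys only (assumption); the real taxonomy lives in the YAML files.
CATEGORY_KEYS = [
    "account",
    "system",
    "user",
    "user.derived",
    "user.provided",
    "user.provided.identifiable",
]


def build_children(keys):
    """Map each key to its direct children, treating the dotted prefix as the parent."""
    children = {key: [] for key in keys}
    for key in keys:
        parent = key.rpartition(".")[0]
        if parent in children:
            children[parent].append(key)
    return children


def downstream(children, selected):
    """Return the selected key followed by everything below it (breadth-first)."""
    seen = [selected]
    queue = deque([selected])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


children = build_children(CATEGORY_KEYS)
highlighted = downstream(children, "user.provided")
print(highlighted)
```

Whatever ends up rendering the graph would only need the `highlighted` list to decide which nodes to color.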
Yeah- it comes down to how much we want to invest in scaffolding out the frontend architecture here and where we see things going. Personally I think we'll find many, many uses for various data visualizations (taxonomies, system maps, datasets, etc.) so having something that makes it easy to build visuals will be key.
Leaving a couple of my notes here after doing some light research on solutions to throw a dataviz layer on our FastAPI backend:
/frontend folder that's a vue app that builds and outputs to /dist folder
So basically, if I were to prototype this now I'd do one of two things:
fidesctl/ui/package.json
make frontend target that builds a /dist
got it, i would consider this a relatively large/risky ticket then due to the amount of unknowns/"firsts" this feature will require.
Do you see this as something that needs to get in before the launch? Trying to get a feel for how we should prioritize this
100% yes it's pretty large & risky. Ideally, if there's a simple way to approximate this (TBH we could have our github pages docs include a good taxonomy visual?) then that'd be a quick way to bridge the gap until we sit down and build a thoughtful UI for all this.
I think we need something before launch to visualize the taxonomy. One option would be to update our docs to point to this separate page: https://clever-ptolemy-e3eb96.netlify.app/, but note that's not quite as good because it only shows the default taxonomies (without any user-specific customizations). We might be able to include that code here, but it's written with a totally different stack and use case (100% d3.js) so I'd hesitate to do that as it'll just be tech debt here.
this might tie into another ticket #91, in that i eventually want to be able to compile docs from source code and have them included in the docs site. That part of the docs wouldn't be able to hot reload probably, but it would solve a lot of problems (for instance, maintaining the Model schema in two places, the docs and the code itself).
So, is a happy balance then having a fidesctl.core.docs module that can extract the entire taxonomy and generate images of it, and then add them to the docs? It can generate a different image for each privacy data type. The technical implementation would be to build a graph and then visualize it (Dask does this, so we could probably figure out what library it uses and use that).
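A minimal sketch of the "build a graph, then visualize it" step, assuming we emit Graphviz DOT text and let the Graphviz CLI do the drawing (as far as I know, Dask's graph rendering also goes through Graphviz, but that's worth double-checking). The function name `taxonomy_to_dot` and the sample keys are hypothetical, and it assumes parent/child relationships can be read off the dotted keys.

```python
def taxonomy_to_dot(keys, root=None):
    """Render dotted category keys as Graphviz DOT text.

    If root is given, only that key and its descendants are included,
    which is one way to get a separate image per top-level type.
    """
    if root is not None:
        keys = [k for k in keys if k == root or k.startswith(root + ".")]
    known = set(keys)
    lines = ["digraph taxonomy {", "  rankdir=LR;"]
    for key in sorted(keys):
        lines.append(f'  "{key}";')
        parent = key.rpartition(".")[0]
        if parent in known:
            # Edge from the dotted prefix (parent) to the key itself.
            lines.append(f'  "{parent}" -> "{key}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative keys only (assumption), not the real taxonomy.
SAMPLE_KEYS = ["account", "system", "user", "user.provided", "user.derived"]
dot_source = taxonomy_to_dot(SAMPLE_KEYS, root="user")
print(dot_source)
```

If the output is saved to a file such as taxonomy.dot, running `dot -Tpng taxonomy.dot -o taxonomy.png` with Graphviz installed would produce the image for the docs.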
A reference with some examples of this kind of visualization done in Python: https://towardsdatascience.com/visualize-hierarchical-data-using-plotly-and-datapane-7e5abe2686e1
Overview
It's really hard to navigate the data categories taxonomy right now when annotating a system or dataset. If you already "know" what's in there, you can probably poke around in the YAML to figure things out, but new users need some kind of aid to understand how it's constructed, search around for the best match, etc.
Requirements
A basic "taxonomy explorer" would serve a couple of functions:
This is far from a final set of ideas or designs, but hopefully we can discuss some options for how we might prototype this as I think it'd be really useful to start building some visuals for all this raw metadata we're working with...
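As a starting point for the "search around for the best match" part, here's a small sketch of how an explorer might rank category keys against what the user has typed so far. It's only an assumption about the matching behavior (prefix matches first, then substring matches); `search_categories` and the sample keys are hypothetical names for illustration.

```python
def search_categories(keys, typed):
    """Return matching keys: prefix matches first, then other substring matches."""
    typed = typed.strip().lower()
    if not typed:
        # Nothing typed yet, so show the whole taxonomy.
        return sorted(keys)
    prefix = [k for k in keys if k.lower().startswith(typed)]
    contains = [k for k in keys if typed in k.lower() and k not in prefix]
    return sorted(prefix) + sorted(contains)


# Illustrative keys only (assumption), not the real taxonomy.
SAMPLE_KEYS = ["account", "system", "user", "user.derived", "user.provided"]
print(search_categories(SAMPLE_KEYS, "user.pro"))
print(search_categories(SAMPLE_KEYS, "provided"))
```

Ranking prefix matches first keeps results stable as the user types deeper into a dotted key, while the substring fallback helps someone who only remembers the last part of a name.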
Mockup
Quick mockup of this would look like:
The above shows:
key where the user has typed in user.provided so far
user and user.provided based on the user's input
system, account)
Data Visualization Example
As a separate project, we created this prototype visualization here which helps a lot: https://clever-ptolemy-e3eb96.netlify.app/
This is a wholly separate d3.js page though, which might not be a good fit for our stack and is designed more for form than function. That said, I found that using this dataviz UI as a visual aid was still wildly better than trying to hunt & peck in the YAML files 👀