Open lmeyerov opened 2 years ago
Hello @lmeyerov 😇, I am interested in contributing to this. Can you assign this issue to me? Any tips on where to start?
awesome!
Where it fits: it'd probably live in https://github.com/graphistry/pygraphistry/tree/master/graphistry/compute , where you can see the pattern of functional methods that take the graph (def some_method(self, ...): self._edges ...)
Testing: run this script for local testing: https://github.com/graphistry/pygraphistry/blob/master/docker/test-cpu-local-minimal.sh
Implementation: Starting with a pandas-based impl is probably best, and later we can scale via dask/cudf/etc. The trick is to stick to vectorized pandas operations: https://pythonspeed.com/articles/pandas-vectorization/
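To make the vectorized style concrete, here is a rough sketch of one possible transform: replacing node IDs in an edge table with opaque integer codes. The function name `anonymize_ids` and its parameters are hypothetical, not an existing PyGraphistry API, and a plain pandas DataFrame stands in for the graph's `_edges`. It also assigns codes in order of first appearance, so treat it as an illustration of the pandas pattern rather than a reviewed anonymization scheme.

```python
import pandas as pd


def anonymize_ids(edges: pd.DataFrame, source: str, destination: str) -> pd.DataFrame:
    """Hypothetical helper: swap node IDs for integer codes without Python-level loops."""
    # Gather every ID seen in either endpoint column into one Series
    all_ids = pd.concat([edges[source], edges[destination]], ignore_index=True)

    # factorize finds the distinct values in a single vectorized pass
    _, uniques = pd.factorize(all_ids)

    # Lookup table: original ID -> integer code (order of first appearance)
    mapping = pd.Series(range(len(uniques)), index=uniques)

    # assign returns a new DataFrame, so the caller's table is left untouched
    return edges.assign(**{
        source: edges[source].map(mapping),
        destination: edges[destination].map(mapping),
    })


edges = pd.DataFrame({
    "src": ["alice", "bob", "alice"],
    "dst": ["bob", "carol", "carol"],
    "weight": [1, 2, 3],
})
anon = anonymize_ids(edges, "src", "dst")
```

The same ID maps to the same code in both columns, so the graph's topology is preserved while the original labels are dropped. Non-ID columns such as `weight` pass through unchanged and would need their own transforms.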
Design: Maybe start with something minimal that we can land, and then we can grow the interface from there?
(happy to review PRs as they happen!)
Is your feature request related to a problem? Please describe.
When sharing graphs with others, especially when going from a private server / private account -> public hub, such as for publicizing or debugging, it'd help to have a way to quickly anonymize a graph.
Sample use cases to make fast:
Perf:
Describe the solution you'd like
Something declarative and configurable like:
Sample transforms:
If there is a popular tabular or graph-centric library here that is well-maintained, we should consider using it ... but not if it looks like a maintenance or security risk
Additional context
Ultimately it'd be good to push this to the UI via some sort of safe mode: role-specific masking, ...