Closed andorsk closed 1 year ago
@animikhroy @andorsk I have added comments to the doc file (https://docs.google.com/document/d/1dB_YSAV-rW3i0Jk1aHKDU5fPjcrS1OZlqdVvUJiGWx0/edit) noting my questions about a few terms in it and where to place them in the documentation. I have also commented on where I placed the other segments, so you can check whether anything is misplaced. Once I get clarification, I'll add the images and finish up the documentation part from my end.
@animikhroy for the installation and usage gif demos, I am thinking of using the installation written in the Readme.md file in the rktoolkit submodule. But for usage, do you want me to use the module example in the file or the example pipeline mentioned in the sphinx documentation?
@Ashxyz998: 1) Thanks for moving the comment to this issue. 2) I need further clarification; it is not clear what you mean by: "But for usage, do you want me to use the module example in the file or the example pipeline mentioned in the sphinx documentation?"
@Ashxyz998 added you to rk-workbench
@animikhroy I meant whether I should use the module usage instructions in the readme file of the rktoolkit,
or whether I should use this code from the example_pipeline:
example_pipeline = RKPipeline(
    preprocess_nodes=[MinMaxNormalizerNode()],
    localization_algorithm=MaxLocalizer(),
    hierarchical_embedding_nodes=[
        {
            "HFeatureExtractor1": HierarchicalFeatureExtractor1()
        }
    ],
    filter_functions=[
        {
            "HFeatureExtractor1": {
                'range_measure': StaticFilter(min=.2, max=.8),
                'max_measure': StaticFilter(min=0, max=1)
            }
        }
    ],  # question: how to define which limits for which measure. Each filter and linkage has to be BY CLUSTER
    linkage_function=SimpleLinkage(threshold=.8))
example_pipeline.build()
example_pipeline.fit(X)
rk_model = example_pipeline.transform(X)
rk_models.append(rk_model)
visualizer = RKModelVisualizer(method="circular")
visualizer.build(rk_models)  # build requires a list of rk_models
visualizer.show()
@Ashxyz998 the instructions are out of date here. They should line up with usage in the ligo notebook. That being said, it's basically the same overarching process.
I have added the documentation I have completed so far to rk_toolkit. More details are in the pull request and in Slack.
Thanks! Will review.
@Ashxyz998 is it ready for review or still in progress?
Things still remaining in the documentation: 1. The installation and usage GIFs need to be added as images to the installation page.
Yes, it should be ready for review apart from the above-mentioned points. The rest is done and working.
As commented in another issue, will be checking and removing unnecessary classes.
@andorsk Need to add a reference to Non-Gradient Combinatorial ML Optimiser function. I can't find it in the Rktoolkit, so would need your help regarding this.
@andorsk Also can't find the code for ChoiceOfLens
@andorsk Isometric compressions code should be added to rktoolkit
Regarding checking for and removing unnecessary classes: it's still unclear which ones are unnecessary, so any input on this would be appreciated.
@Ashxyz998 Thanks for the questions. Here are the answers:
1. The Non-Gradient Combinatorial ML Optimiser is here. It relies on a package called nevergrad, made by Facebook.
2. Choice of lens: this is not a piece of explicit code. It's related to the hierarchy file. It's about how the ontology's perspective is related to the data, and it is an implicit concept.
3. Removing unnecessary classes is going to be a trudge. Basically, go through the current notebooks, look at what's not being used, and see if it still makes sense to keep it. There's some redundancy in the graph.py file, for example, and the visualizers require some refactoring, as the core visualizer class ended up being kicked out in favor of visualizer/util.py, which is supposed to mature into something like the original visualizer class.
@Ashxyz998 isometric code should exist in the RK Toolkit. You are correct. I don't think it does currently.
Thanks for the response
1. An implementation of the Non-Gradient Combinatorial ML Optimiser is here: https://github.com/animikhroy/rk_toolkit_pipeline_diagrams/blob/main/02_notebooks/rk_gw_mma/common.py#L63
Would that be fine?
2. The choice of lens is determined using the hierarchy file, and can be changed according to our needs. An example of this is in the code here: https://github.com/animikhroy/rk_toolkit_pipeline_diagrams/blob/main/02_notebooks/rk_gw_mma/data/gwtc_heirarchy_pretty.json
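To illustrate what "changing the lens" could mean in practice, here is a minimal sketch. It assumes a hypothetical nested-JSON hierarchy whose leaves are lists of measurement names; the real hierarchy files in the repository may well be structured differently, and `apply_lens` is an illustrative helper, not an rktoolkit function. The point is only that the same flat measurements get grouped differently depending on which hierarchy is supplied.

```python
import json

# Two hypothetical hierarchy ("lens") definitions over the same measurements.
# The schema (nested dicts with lists of measurement names as leaves) is an
# assumption made for illustration; it is not the rktoolkit file format.
LENS_BY_DEPARTMENT = json.loads("""
{"store": {"grocery": ["produce_sales", "dairy_sales"],
           "household": ["cleaning_sales"]}}
""")
LENS_BY_MARGIN = json.loads("""
{"store": {"high_margin": ["dairy_sales", "cleaning_sales"],
           "low_margin": ["produce_sales"]}}
""")


def apply_lens(hierarchy, measurements):
    """Attach flat measurement values to the leaves of a nested hierarchy."""
    if isinstance(hierarchy, list):
        # Leaf level: look up each named measurement.
        return {name: measurements[name] for name in hierarchy}
    # Inner level: recurse into each named group.
    return {key: apply_lens(child, measurements) for key, child in hierarchy.items()}


measurements = {"produce_sales": 0.4, "dairy_sales": 0.9, "cleaning_sales": 0.1}
by_department = apply_lens(LENS_BY_DEPARTMENT, measurements)
by_margin = apply_lens(LENS_BY_MARGIN, measurements)
```

Here the data never changes; only the hierarchy does, which is why the lens is a concept carried by the hierarchy file rather than a class in the code.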
@Ashxyz998 For 2 (choice of lens): refer to the general applications store sales folder, because the choice-of-lens implementation is explained clearly with reference to the store_sales example in the paper.
For 1, @andorsk would be able to confirm better.
Updated the link for 2 and committed the documentation.
Only the Isometric Compressions section still needs code. Apart from that, the installation and usage GIFs need to be added, which I'll do once I get confirmation on what I should record.
@Ashxyz998 really nice job adding the optimization function in the rk toolkit. I didn't think about adding it before but it makes sense.
I'll get you the isometric compression code. One bit
This is what I had. We should write some tests for it, but that should be what you need for graph compression.
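from collections import defaultdict

# GraphMask, Vertex, and Edge are assumed to be the rktoolkit graph classes;
# their import path is not shown in this thread.


class GraphCompressor():

    def transform(self, G):
        return self._compress_unconnected_nodes(G)

    def _compress_unconnected_nodes(self, g):
        # Group leaf nodes (exactly one edge, no children) under their parent.
        ncomp = defaultdict(set)
        for n in g.nodes:
            if g.degree[n] == 1 and len(g.get_children(n)) == 0:
                for a in g.predecessors(n):
                    ncomp[a].add(n)

        # Mask out every collected leaf node.
        arr = []
        for n, v in ncomp.items():
            arr.extend(v)
        gm = GraphMask(nmasks=arr).fit(g)

        # Add one placeholder vertex per parent to stand in for its masked leaves.
        for n, v in ncomp.items():
            name = "Disconnected {}".format(n)
            pred = list(gm.predecessors(n))
            col = 'black'
            if len(pred) > 0:
                # Use the parent's color when the parent itself has a predecessor.
                col = g.nodes[n].get('color', 'black')
            vv = Vertex(name, attributes={"clustered_ref": name, 'color': col})
            gm.add_vertex(vv)
            gm.add_edge(Edge(n, name))
        return gm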
Link for viewing here: https://ml.kesselmanrao.com/notebooks/store_sales/Round%206-Copy1.ipynb
Usage: compressed_models = [GraphCompressor().transform(m.get()) for m in rkmodels]
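Since tests for this are still to be written, a dependency-free sketch of the same idea may be a useful reference when writing them. This is not the rktoolkit implementation: it assumes the graph is given as a plain dict mapping each node to a list of its children, and `compress_leaves` is a hypothetical helper name. Like the class above, it merges leaves that have exactly one parent into a single "Disconnected {parent}" placeholder.

```python
from collections import defaultdict


def compress_leaves(children):
    """Replace each parent's single-parent leaf children with one placeholder.

    `children` maps node -> list of child nodes. This plain-dict form is an
    assumption for illustration; rktoolkit uses its own graph classes.
    """
    # Record the parents of every node so leaves with one parent can be found.
    parents = defaultdict(set)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)

    def is_mergeable_leaf(node):
        # A leaf has no children; it is merged only if it has exactly one parent.
        return not children.get(node) and len(parents[node]) == 1

    compressed = {}
    for node, kids in children.items():
        if is_mergeable_leaf(node):
            continue  # absorbed into its parent's placeholder
        kept = [k for k in kids if not is_mergeable_leaf(k)]
        if len(kept) < len(kids):
            # At least one leaf was dropped, so add the stand-in node.
            kept.append("Disconnected {}".format(node))
        compressed[node] = kept
    return compressed


tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2", "a3"],
    "b": ["b1"],
    "b1": ["b1x"],
}
compressed_tree = compress_leaves(tree)
```

In this example the three leaves under "a" collapse into "Disconnected a" and the single leaf under "b1" collapses into "Disconnected b1", while the inner nodes are left untouched.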
Hey @andorsk, I have not added this to the documentation code, as it is still not part of the rktoolkit code. But I have completed the rest of the documentation. Only the usage GIF is left, which @animikhroy mentioned is not required as of now.
This issue can be closed after reviewing the documentation and merging it.
This issue was closed upon internal review. Will reopen based on any specific requirements from the peer-reviewers.
add sphinx documentation.