animikhroy / rk_toolkit_pipeline_diagrams

Master-repository for all code related to "A Novel Approach to Topological Graph Theory with R-K Diagrams and Gravitational Wave Analysis"
https://arxiv.org/abs/2201.06923

sphinx documentation #15

Closed by andorsk 1 year ago

andorsk commented 2 years ago

add sphinx documentation.

ghost commented 1 year ago

@animikhroy @andorsk I have added comments to the doc file (https://docs.google.com/document/d/1dB_YSAV-rW3i0Jk1aHKDU5fPjcrS1OZlqdVvUJiGWx0/edit) noting my doubts about a few terms in it and where to place them in the documentation. I have also commented on where I placed the other segments, so you can check if there is any misplacement. Once I get clarification, I'll add the images and finish up the documentation from my end.

ghost commented 1 year ago

@animikhroy For the installation and usage GIF demos, I am thinking of using the installation steps written in the Readme.md file in the rktoolkit submodule. But for usage, do you want me to use the module example in that file or the example pipeline mentioned in the sphinx documentation?

animikhroy commented 1 year ago

@Ashxyz998: 1) Thanks for shifting the comment to this issue. 2) I need further clarification; it is not very clear what you mean by: "But for usage, do you want me to use the module example in the file or the example pipeline mentioned in the sphinx documentation?"

andorsk commented 1 year ago

@Ashxyz998 added you to rk-workbench

ghost commented 1 year ago

@animikhroy I meant whether I should use the module usage instructions present in the readme file of the rktoolkit,

or whether I should use this code from the example_pipeline:

# Assumes RKPipeline, MinMaxNormalizerNode, MaxLocalizer, HierarchicalFeatureExtractor1,
# StaticFilter, SimpleLinkage, and RKModelVisualizer are imported from rktoolkit.
# X is the input data and rk_models an existing list, both defined elsewhere.
example_pipeline = RKPipeline(
    preprocess_nodes=[MinMaxNormalizerNode()],
    localization_algorithm=MaxLocalizer(),
    hierarchical_embedding_nodes=[
        {"HFeatureExtractor1": HierarchicalFeatureExtractor1()}
    ],
    filter_functions=[
        {
            "HFeatureExtractor1": {
                'range_measure': StaticFilter(min=.2, max=.8),
                'max_measure': StaticFilter(min=0, max=1)
            }
        }
    ],  # question: how to define which limits for which measure. Each filter and linkage has to be BY CLUSTER
    linkage_function=SimpleLinkage(threshold=.8))

example_pipeline.build()
example_pipeline.fit(X)
rk_model = example_pipeline.transform(X)
rk_models.append(rk_model)

visualizer = RKModelVisualizer(method="circular")
visualizer.build(rk_models)  # build requires a list of rk_models
visualizer.show()

andorsk commented 1 year ago

@Ashxyz998 the instructions are out of date here. They should line up with usage in the ligo notebook. That being said, it's basically the same overarching process.

ghost commented 1 year ago

Added the documentation I have done so far to rk_toolkit. More details are mentioned in the pull request and in Slack.

andorsk commented 1 year ago

Thanks! Will review.

andorsk commented 1 year ago

@Ashxyz998 is it ready for review or still in progress?

ghost commented 1 year ago

The things remaining to be done in the documentation:

  1. Add the installation and usage GIFs as images to the installation page.
  2. Check that all references are shown properly and debug any that are not.
  3. Add code links and notebook links for the topics that are in the Jupyter notebooks. (Would need @andorsk to help me with this.)
  4. Deal with the TODOs (most of them were written by @andorsk, so I have not modified them as they might be references he needs).

ghost commented 1 year ago

> @Ashxyz998 is it ready for review or still in progress?

Yes, it should be ready for review apart from the points mentioned above. The rest is done and working.

ghost commented 1 year ago

As commented in another issue, will be checking and removing unnecessary classes.

ghost commented 1 year ago

@andorsk I need to add a reference to the Non-Gradient Combinatorial ML Optimiser function. I can't find it in the rktoolkit, so I would need your help regarding this.

ghost commented 1 year ago

@andorsk Also, I can't find the code for ChoiceOfLens.

ghost commented 1 year ago

@andorsk Isometric compressions code should be added to rktoolkit

ghost commented 1 year ago

> As commented in another issue, will be checking and removing unnecessary classes.

It's still confusing which ones are unnecessary, so any input on this would be appreciated.

andorsk commented 1 year ago

@Ashxyz998 Thanks for the questions. Here are the answers:

  1. The Non-Gradient Combinatorial ML Optimiser is here. It relies on a package called nevergrad, made by Facebook. (A minimal sketch of gradient-free optimization with nevergrad is shown after this list.)

  2. Choice of lens. This is not a piece of explicit code; it's related to the hierarchy file. It's about how the ontology's perspective relates to the data, and it is an implicit concept.

  3. Removing the unnecessary classes is going to be a trudge. Basically, go through the current notebooks, look at what's not being used, and see if it still makes sense to keep it in. There's some redundancy in the graph.py file, for example, and the visualizers require some refactoring, as the core visualizer class ended up being kicked out for visualizer/util.py, which is supposed to mature into something like the original visualizer class.
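A minimal sketch of what gradient-free optimization with nevergrad looks like in general; the objective function, parameter names, and budget below are illustrative placeholders rather than the actual optimiser from common.py:

import nevergrad as ng

# Toy objective used only for illustration; the real optimiser in common.py
# works on the R-K pipeline's own objective instead.
def objective(x, y):
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

# OnePlusOne is one of nevergrad's gradient-free optimizers; budget caps the
# number of objective evaluations.
optimizer = ng.optimizers.OnePlusOne(
    parametrization=ng.p.Instrumentation(x=ng.p.Scalar(), y=ng.p.Scalar()),
    budget=200,
)
recommendation = optimizer.minimize(objective)
print(recommendation.value)  # ((), {'x': ..., 'y': ...}) for the best point found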

andorsk commented 1 year ago

@Ashxyz998 isometric code should exist in the RK Toolkit. You are correct. I don't think it does currently.

ghost commented 1 year ago

Thanks for the response.

  1. I'll add the link for this in the following way:

     An implementation of the Non-Gradient Combinatorial ML Optimiser is here: https://github.com/animikhroy/rk_toolkit_pipeline_diagrams/blob/main/02_notebooks/rk_gw_mma/common.py#L63

     Would that be fine?

  2. Should I frame the documentation for this like below? (See the loading sketch after this list.)

     The choice of lens is determined using the hierarchy file, and can be changed according to our needs. An example of this is present in the code here: https://github.com/animikhroy/rk_toolkit_pipeline_diagrams/blob/main/02_notebooks/rk_gw_mma/data/gwtc_heirarchy_pretty.json

  3. Okay, I'll do that and will make a list and inform you before removing anything.

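A minimal sketch of what that documentation example could show; only the file path comes from the repository link above, and nothing is assumed about the JSON schema itself:

import json

# The "choice of lens" is not an explicit class in rktoolkit: it is encoded in
# the hierarchy file handed to the pipeline. Loading a different hierarchy file
# applies a different ontological perspective (lens) to the same data.
with open("02_notebooks/rk_gw_mma/data/gwtc_heirarchy_pretty.json") as f:
    lens_hierarchy = json.load(f)

# The hierarchical embedding is then built from this structure, so swapping the
# file swaps the lens without changing any pipeline code.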
animikhroy commented 1 year ago

@Ashxyz998 For 2 (choice of lens): refer to the general applications store sales folder, because the choice-of-lens implementation is explained clearly with reference to the store_sales example in the paper.

For 1, @andorsk would be able to confirm better.

ghost commented 1 year ago

Updated the link for 2 and committed the documentation.

Only the isometric compressions still need code. Apart from that, the installation and usage GIFs need to be added, which I'll do once I get confirmation on what I should record.

andorsk commented 1 year ago

@Ashxyz998 really nice job adding the optimization function in the rk toolkit. I didn't think about adding it before but it makes sense.

I'll get you the isometric compression code. One bit

andorsk commented 1 year ago

This is what I had. We should write some tests for it, but that should be what you need for graph compression.

from collections import defaultdict

# GraphMask, Vertex, and Edge are assumed to be available from the rktoolkit
# graph module alongside this class.

class GraphCompressor():

    def transform(self, G):
        return self._compress_unconnected_nodes(G)

    def _compress_unconnected_nodes(self, g):
        # Group degree-1 leaf nodes (no children) under their predecessors.
        ncomp = defaultdict(set)
        for n in g.nodes:
            if g.degree[n] == 1 and len(g.get_children(n)) == 0:
                for a in g.predecessors(n):
                    ncomp[a].add(n)

        # Mask all of the collected leaf nodes out of the graph.
        arr = []
        for v in ncomp.values():
            arr.extend(v)
        gm = GraphMask(nmasks=arr).fit(g)

        # Replace each group of masked leaves with a single clustered
        # "Disconnected <parent>" vertex attached to the parent node.
        for n, v in ncomp.items():
            name = "Disconnected {}".format(n)
            pred = list(gm.predecessors(n))
            col = 'black'
            if len(pred) > 0:
                col = g.nodes[n].get('color', 'black')
            vv = Vertex(name, attributes={"clustered_ref": name, 'color': col})
            gm.add_vertex(vv)
            gm.add_edge(Edge(n, name))
        return gm

andorsk commented 1 year ago

Link for viewing here: https://ml.kesselmanrao.com/notebooks/store_sales/Round%206-Copy1.ipynb

Usage: compressed_models = [GraphCompressor().transform(m.get()) for m in rkmodels]
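Since tests still need to be written for this, here is a rough shape such a test could take; build_small_graph() is a hypothetical placeholder because the concrete rktoolkit graph constructor isn't shown in this thread:

import pytest

def build_small_graph():
    # Hypothetical helper: should return an rktoolkit hierarchical graph with at
    # least one degree-1 leaf node hanging off a parent node.
    raise NotImplementedError("fill in with the real rktoolkit graph constructor")

@pytest.mark.skip(reason="needs a concrete graph constructor from rktoolkit")
def test_compressor_clusters_unconnected_leaves():
    g = build_small_graph()
    gm = GraphCompressor().transform(g)
    clustered = [n for n in gm.nodes if str(n).startswith("Disconnected")]
    assert clustered, "expected at least one clustered 'Disconnected' vertex"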

ghost commented 1 year ago

Hey @andorsk, I have not added this to the documentation as it is still not part of the rktoolkit code. But I have completed the rest of the documentation. Only the usage GIF is left, which @animikhroy had mentioned is not required as of now.

ghost commented 1 year ago

This issue can be closed after reviewing the documentation and merging it.

animikhroy commented 1 year ago

This issue was closed upon internal review. Will reopen based on any specific requirements from the peer-reviewers.