animikhroy / rk_toolkit_pipeline_diagrams

Master-repository for all code related to "A Novel Approach to Topological Graph Theory with R-K Diagrams and Gravitational Wave Analysis"
https://arxiv.org/abs/2201.06923

Primary Analysis Python code needs to be added to rk_gw_mma under notebooks #5

Closed animikhroy closed 2 years ago

animikhroy commented 2 years ago

@andorsk The following folder, which is vitally important for our PR publication, rk_toolkit_pipeline_diagrams/pruned/02_notebooks/rk_gw_mma/, currently contains: data, helpers.py, ligo.ipynb

However, as a key component of our PR publication, it needs to be updated to contain the following:

data, helpers.py, ligo_primary_analysis.ipynb, ligo_secondary_analysis.ipynb

The current file, ligo.ipynb, contains only the secondary analysis, which will be updated with your results soon. However, as discussed today, the primary analysis was already completed earlier on the strain data from gw_openscience, and all of that work needs to be added from andorsk/a_novel_approach_toward_tda_paper/notebooks/ into a single .ipynb file titled ligo_primary_analysis.ipynb

andorsk commented 2 years ago

Perfect. Thanks @animikhroy . Nice job raising the issue.

andorsk commented 2 years ago

@animikhroy which file specifically needs to be there? There are a few files:

https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/Pipeline%20v1.ipynb
https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/LOSC_Event_tutorial.ipynb
https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/Visualizations.ipynb
https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/RK%20Model.ipynb

And a few others. Please let me know.

animikhroy commented 2 years ago

@andorsk I did an initial review and could not reach a definite answer for the following reasons:

1) The first link has everything in terms of cleaning up the raw data, searching for the signal, separating the noise, applying Gaussian and Tukey filters, and the initial clustering of the event: https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/Pipeline%20v1.ipynb

2) The second link is not relevant to our paper, as it is a direct LIGO tutorial that we used as a point of reference to begin our work, but it has nothing in particular to do with our novel approach: https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/LOSC_Event_tutorial.ipynb So this can be conclusively discarded.

3) The third link is also relevant because it has all the important parts regarding merging multiple events, applying noise reduction filters to them, and plotting the eventscape as described in the paper. However, it is also limited and covers only some of the relevant aspects while missing others: https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/Visualizations.ipynb

4) My best guess is that everything is together in the last notebook, which should contain most if not all of the primary analysis as done in the rk_visualizer. However, this file is too long to render on GitHub, so I downloaded it and tried it in Jupyter Notebook and Google Colab, but neither seems to be able to run it for some reason. Hence I am unable to review it and reach a definite conclusion. https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/RK%20Model.ipynb Could you confirm whether you are able to run it on your side? If so, I'd like to find a way to review all its modules and corresponding plots and then get back to you.
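For reference, the Tukey (tapered cosine) windowing mentioned in item 1 can be sketched roughly as below. This is only a minimal illustration written against the Python standard library, not the code from Pipeline v1.ipynb; the function names `tukey_window` and `apply_window` and the toy strain values are assumptions made for this example. SciPy's `scipy.signal.windows.tukey` provides the same window for real use.

```python
import math


def tukey_window(n_samples, alpha=0.5):
    """Return a symmetric Tukey (tapered cosine) window as a list of floats.

    alpha is the fraction of the window inside the cosine tapers:
    alpha <= 0 gives a rectangular window, alpha >= 1 gives a Hann window.
    """
    if n_samples <= 0:
        return []
    if n_samples == 1 or alpha <= 0:
        return [1.0] * n_samples
    alpha = min(alpha, 1.0)
    # Width (in samples) of the cosine taper at each end of the window.
    taper = alpha * (n_samples - 1) / 2.0
    window = []
    for n in range(n_samples):
        # Distance of this sample from the nearest window edge.
        d = min(n, n_samples - 1 - n)
        if d < taper:
            window.append(0.5 * (1.0 - math.cos(math.pi * d / taper)))
        else:
            window.append(1.0)
    return window


def apply_window(samples, window):
    """Multiply a segment element-wise by a window of the same length."""
    if len(samples) != len(window):
        raise ValueError("samples and window must have the same length")
    return [s * w for s, w in zip(samples, window)]


# Toy data standing in for a short strain segment (hypothetical values).
toy_strain = [1.0] * 11
tapered = apply_window(toy_strain, tukey_window(len(toy_strain), alpha=0.5))
```

Tapering the ends of a segment to zero in this way reduces edge artifacts before a Fourier transform or filtering step, which is the usual reason for windowing strain data.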

andorsk commented 2 years ago

@animikhroy you should be able to also see the contents on ml.kesselmanrao.com. Please take a look there.

animikhroy commented 2 years ago

@andorsk Thanks for this. I'll check out ml.kesselmanrao.com as soon as I am free today and get back to you.

animikhroy commented 2 years ago

@andorsk I reviewed the following notebook on ml.kesselmanrao.com in detail: https://github.com/andorsk/a_novel_approach_toward_tda_paper/blob/master/notebooks/RK%20Model.ipynb. I found that it contains some but not all of the relevant parts with respect to the rk_visualiser. It has the code related to R-K models, the omni-plot of all 15 parameters, and a clear-cut example of how clustering is carried out in the rk_visualiser, as demonstrated and explained in our paper. However, it lacks the primary analysis components: the spectrogram plots with noise, followed by noise reduction and mapping of events, along with the strain data plots in the first 2 steps. It also contains some hard-coded components which definitely need to be deleted.

Hence I am unable to find a single .ipynb notebook that contains all the analysis used in each step of the visualizer, which is vitally important in order to present the first part in a separate notebook called ligo_primary_analysis.ipynb for this publication. So here is my proposal: I can piece together each step of the relevant code from the different .ipynb files on ml.kesselmanrao.com. However, I would like to do a very short 15-minute Google Meet call with you to show you the components and how I aim to go about the process, to make sure I do not mess anything up and waste time. This is also because I cannot find the code related to 2 specific plots, while the rest is available in different .ipynb files.

I shall wait for your availability to do this quick call tonight/tomorrow at your convenience. Meanwhile I am absolutely focused on reformatting and re-writing parts of the paper based on the feedback and inputs from IUCAA to prepare it for PRX & PRD.

animikhroy commented 2 years ago

@andorsk I am revisiting this as it is vitally important, and I have not yet been able to get on a call with you to decide the correct way to go about it. So here are the 2 possible options with the limited time at hand:

1) Create an independent .ipynb file named ligo_primary_analysis.ipynb using the relevant sections of the following rk_visualizer.py (https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/rk_visualizer.py), without all the code related to the buttons and the UI, and make sure all the correct plots and visualizations are generated in the correct sequence.

2) Merge the following notebooks together to then create a new ligo_primary_analysis.ipynb file, which then requires deleting all the unnecessary plots and the parts that give error messages, to provide a clean file with all the code and plots in proper sequence.

a) https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/readligo.py
b) https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/Pipeline%20v2.ipynb
c) https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/Pipeline%20v1.ipynb
d) https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/PBH%20LIGO%20Pull.ipynb
e) https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/RK%20Model.ipynb

The sequence from a) to e) above matches the steps of the visualizer and of our paper. I have verified that for your reference, and I can do some clean-up and documentation once you merge all of the above files (if you choose this as the correct way).

Finally, I have created a dedicated file on https://ml.kesselmanrao.com/ called ligo_primary_analysis.ipynb and laid out all the required sections in the correct sequence. So the relevant code and plots need to be added there using either option 1 or option 2, whichever is faster for you @andorsk and saves the most time! Here is the new primary analysis notebook for your reference: https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/ligo_primary_analysis.ipynb#2.
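Option 2 above (concatenating the notebooks in order) could be sketched roughly as follows. This is a hedged illustration, not something that was run against the actual files: it assumes every input is an nbformat 4 notebook (plain JSON with a top-level `cells` list), and the function name `merge_notebooks` and the file names in the usage note are hypothetical. readligo.py is a Python module rather than a notebook, so it would be imported from the merged notebook instead of being merged.

```python
import json
from pathlib import Path


def merge_notebooks(paths):
    """Concatenate the cells of several nbformat-4 notebooks, in order.

    Assumption: each file is an nbformat 4 notebook. Kernel metadata is
    taken from the first notebook in the list.
    """
    # Written as nbformat 4.4: per-cell "id" fields (introduced in 4.5) are
    # dropped below so that ids from different notebooks cannot collide.
    merged = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    for index, path in enumerate(paths):
        path = Path(path)
        notebook = json.loads(path.read_text(encoding="utf-8"))
        if index == 0:
            merged["metadata"] = notebook.get("metadata", {})
        # Heading cell recording which source notebook the cells came from.
        merged["cells"].append({
            "cell_type": "markdown",
            "metadata": {},
            "source": ["## From " + path.name],
        })
        for cell in notebook.get("cells", []):
            cell = dict(cell)  # shallow copy; the loaded notebook is untouched
            cell.pop("id", None)
            if cell.get("cell_type") == "code":
                # Drop stale outputs so everything is re-run in sequence.
                cell["outputs"] = []
                cell["execution_count"] = None
            merged["cells"].append(cell)
    return merged
```

A call such as `merge_notebooks(["Pipeline v2.ipynb", "Pipeline v1.ipynb"])` returns a dictionary that can be written out with `json.dump` as the new notebook. Clearing the outputs means stale or erroring plots are not carried over, but the merged cells would still need to be run and pruned by hand.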

andorsk commented 2 years ago

@animikhroy is this something maybe @Ashxyz998 can help with? This is a refactoring issue it sounds like.

animikhroy commented 2 years ago

@andorsk Nope, actually he doesn't have enough understanding of the code to be able to do this, and he has very limited time at hand because of a medical emergency, so I am trying to get @Ashxyz998 to finish the rest of his preassigned tasks. This is something only you can do quickly and correctly without breaking things. I tried it initially and it broke, which is why I requested a call with you. I have sent you WhatsApp messages regarding the same.

animikhroy commented 2 years ago

@andorsk Plus this is too vital of a component for the peer reviewed evaluation of the paper so it needs to be done in the best way by the best person for the proper acceptance.

andorsk commented 2 years ago

I'll work on this if time frees up; however, as of now I don't have time to work on this. This is new work, which we discussed I was not going to do.

Since it is a new notebook, there is no worry about breaking things. It sounds like it's just moving things around and running things from the old notebook.

animikhroy commented 2 years ago

@andorsk I raised this issue earlier, and I have specified everything in proper sequence so that you can just put the correct code in the respective cells and run it without breaking things. The code may need to come from different notebooks that have already been written in the past. No new code needs to be written for this; we just need to make sure everything is in order and running correctly, with the respective plots, in the new Jupyter notebook. That's all. You can do this faster than anyone!

andorsk commented 2 years ago

@animikhroy as I mentioned, if I have time I will address this. As of right now, I do not have the time. Please plan accordingly.

andorsk commented 2 years ago

I've been working on this, but there have been a good number of problems. This is a much bigger effort than might have been thought, primarily for 3 reasons:

  1. It's old
  2. At the time, there was a lot of iteration and quick actions to get some X visual out for your demo.
  3. We actually never did a proper primary source analysis. We did hacks for demos. And that's probably the biggest issue.

By 3: If you recall, we did some hacky, last second stuff for a demo that was pulled last minute. For example, we generated some gaussians and used that instead of the actual primary data. All this stuff was kinda thrown together last minute for your demo, and we never went back to it. And so things are super messy.

I'm trying to stick to what we've done already, not new things, but that could be a problem. If it's just for demonstrative purposes, I don't know that it matters.

There's a lot of other issues as well.

Bottom line: The primary analysis is not in a good shape tbh. You can check out the below. I've been working on it since a while ago.

https://ml.kesselmanrao.com/notebooks/a_novel_approach_toward_tda_paper/notebooks/ligo_primary_analysis.ipynb#2

andorsk commented 2 years ago

Some better news: a few of the later parts flowed in pretty much copy pasta, so it's a little better than before. @animikhroy I'm going to be pretty unavailable the rest of this week, and I'll be in the air/transit on Tuesday for ~24+ hours, so please plan accordingly.

Take a look at the notebook, see if there's anything you can use.

andorsk commented 2 years ago

Done here: https://github.com/animikhroy/rk_toolkit_pipeline_diagrams/pull/33