rivetTDA / rivet

RIVET is a tool for Topological Data Analysis, in particular two-parameter persistent homology.
GNU General Public License v3.0
73 stars 24 forks source link

Many improvements to RIVET #115

Closed mlesnick closed 6 years ago

mlesnick commented 6 years ago

This pull request encompasses my work to the code, as well as Roy Zhao's work and Simon Segert's work.

Here is a summary:

-Roy's work (and my edits to it) introduce the ability to handle multicritical filtrations, including the construction of the degree-Rips bifiltration.

-To improve the structure of the code, Roy introduced the BifiltrationData and FIRep classes. He got rid of the simplex tree representation of a simplicial complex, replacing it with a more straightforward list-of-simplices representations of a simplicial complex in BifiltrationData. Note that this stores only simplices in dimension j-1, j, and j+1, where j is the homology dimension of interest.

-Roy's work also introduces the firep input file type, which allows for presentations and arbitrary boundary matrices (e.g., cellular boundary matrices) as input.

-Roy's work changes the input format for the "bifiltration" input file. In particular, files of this type now need to include simplices of lower dimension, not just maximal simplices.

-My work replaces the linked list column data structures that used to underly RIVET with (optimized versions of) PHAT's lazy heaps.

-I added the ability to compute Betti numbers via computation of a minimal presentation of a 2-D persistence module. This replaces the earlier Koszul Homology algorithm for computing Betti numbers as the default, though the older algorithm is still available in the rivet_console via the --koszul flag. The performance of the two algorithms is comparable for computing Betti numbers, but the new approach makes barcode template computation much, much faster, since a minimal presentation (which is usually quite small) can now be given as input to the barcode template computations. The speed of minimal presentation / Betti number computation will continue to improve significantly in the future as we do more optimizations.

-The minimal presentation can be printed via the --minpres flag.

-I have fixed issue #76, which was the last remaining major bug.

-Simon made improvements to the interface which allow us to adjust the bounds of the viewable window. This allows us to compare two different RIVET visualizations on the same scale, and to zoom in on the interesting parts of the output.

-Simon also added the ability to handle descending coordinates gracefully in RIVET. One can now specify in an input file that a coordinate direction is descending by adding "[-]" in front of the axis label, e.g. "[-]" density. The prefix "[+]" to specify an ascending coordinate direction is optional.

-I have updated the data folder in RIVET to include more examples, including lots of small toy examples that I have accumulated over time.

Note that the RIVET documentation is only partially caught up with these improvements and extensions, and in any case ought to be consolidated into a single file, instead of multiple web pages. We will work on this in the next weeks.

There is still more to do in many of areas where this pull request provides improvements, but I believe that we are currently at a nice local optimum, and that this is a good time to incorporate these improvements into RIVET.

xoltar commented 6 years ago

@SimonSegert @mlesnick I've merged master into this pull request. It builds and seems to run OK, but please test to ensure I didn't mess anything up in the merge. One thing I noted was the labels for x and y reversed are still present on the data selection dialog and say "TextLabel".

SimonSegert commented 6 years ago

@xoltar regarding the labels, lines 206-276 inclusive in dataselectdialog.ui should not be there, as in Mike's merge commit from yesterday

mlwright84 commented 6 years ago

Hi everyone -- sorry for my delay in looking at this pull request! This code is giving me runtime errors. Specifically, rivet_console crashes with a std::runtime_error and the message "Unsupported output format: R1". Has anyone else experienced this?

xoltar commented 6 years ago

Hi Matthew, sounds like perhaps the console and UI are out of sync? Perhaps you built the console but are still running an old version of the viewer?

On Thu, May 31, 2018 at 2:48 PM, Matthew Wright notifications@github.com wrote:

Hi everyone -- sorry for my delay in looking at this pull request! This code is giving me runtime errors. Specifically, rivet_console crashes with a std::runtime_error and the message "Unsupported output format: R1". Has anyone else experienced this?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393691662, or mute the thread https://github.com/notifications/unsubscribe-auth/ABK13La3kDHckv-7bHJjkXC-3seVbq0Iks5t4GU0gaJpZM4T6CyL .

mlwright84 commented 6 years ago

Bryn, I also thought it might be something like that. However, I just cloned from mlesnick/master into a new folder and compiled again, and I'm still getting the same error. I'll look into this and try to figure out what's going on. My system is running Ubuntu 18.04, cmake 3.10.2, g++ 7.3.0-16ubuntu3, Boost 1.65.1.

xoltar commented 6 years ago

Ah, in that case I probably clobbered my change to pass the format as messagepack, so it's still passing R1 instead. Not sure why it worked for me in that case...

Unless you're getting this error when trying to load a precomputed file? In that case it would just be that we don't support boost serialization anymore.

On Thu, May 31, 2018 at 3:20 PM, Matthew Wright notifications@github.com wrote:

Bryn, I also thought it might be something like that. However, I just cloned from mlesnick/master into a new folder and compiled again, and I'm still getting the same error. I'll look into this and try to figure out what's going on. My system is running Ubuntu 18.04, cmake 3.10.2, g++ 7.3.0-16ubuntu3, Boost 1.65.1.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393699441, or mute the thread https://github.com/notifications/unsubscribe-auth/ABK13EaOVuZC31hfFRltsQyTFEKuglnEks5t4GyqgaJpZM4T6CyL .

mlesnick commented 6 years ago

I am also finding that Bryn's merge crashes, for example with 240 points. I am also getting an std::runtime_error crash, but did not see the "Unsupported output format: R1" message. This is on my MacBook.

I am not loading a precomputed file when I get the std::runtime_error.

I didn't see any crashes before Bryn's merge today.

On Thu, May 31, 2018 at 7:03 PM Bryn Keller notifications@github.com wrote:

Ah, in that case I probably clobbered my change to pass the format as messagepack, so it's still passing R1 instead. Not sure why it worked for me in that case...

Unless you're getting this error when trying to load a precomputed file? In that case it would just be that we don't support boost serialization anymore.

On Thu, May 31, 2018 at 3:20 PM, Matthew Wright notifications@github.com wrote:

Bryn, I also thought it might be something like that. However, I just cloned from mlesnick/master into a new folder and compiled again, and I'm still getting the same error. I'll look into this and try to figure out what's going on. My system is running Ubuntu 18.04, cmake 3.10.2, g++ 7.3.0-16ubuntu3, Boost 1.65.1.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393699441, or mute the thread < https://github.com/notifications/unsubscribe-auth/ABK13EaOVuZC31hfFRltsQyTFEKuglnEks5t4GyqgaJpZM4T6CyL

.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393709546, or mute the thread https://github.com/notifications/unsubscribe-auth/AL-lBCa-EujXnR3KuQEfnUQIUUhsH__Xks5t4HbAgaJpZM4T6CyL .

mlwright84 commented 6 years ago

I am also not loading any precomputed file. I have experienced this error with points cloud, bifiltration, and FI rep input files.

The error is a std::runtime_error, but it comes with the message "Unsupported output format: R1". The error is being thrown by line 468 of console.cpp, because params.outputFormat isn't recognized. I didn't use the -f flag when I called rivet_console, and the help screen says the format defaults to "R1". Perhaps this default simply needs to be changed?

Unfortunately, if I call rivet_console using the flag -f msgpack, I get a segmentation fault when the output file is being written.

Bryn, do you have any insight on what is happening here?

Thanks, Matthew

On Thu, May 31, 2018 at 6:25 PM mlesnick notifications@github.com wrote:

And to be clear, I am not loading any precomputed file when I get the std::runtime_error.

On Thu, May 31, 2018 at 7:23 PM Michael Lesnick mlesnick@princeton.edu wrote:

I am also finding that Bryn's merge crashes, for example with 240 points. However, unlike Matthew, I am not getting a "Unsupported output format" message. Instead, I am getting an std::runtime_error crash. This is on my MacBook.

I didn't see any crashes before Bryn's merge today.

On Thu, May 31, 2018 at 7:03 PM Bryn Keller notifications@github.com wrote:

Ah, in that case I probably clobbered my change to pass the format as messagepack, so it's still passing R1 instead. Not sure why it worked for me in that case...

Unless you're getting this error when trying to load a precomputed file? In that case it would just be that we don't support boost serialization anymore.

On Thu, May 31, 2018 at 3:20 PM, Matthew Wright < notifications@github.com

wrote:

Bryn, I also thought it might be something like that. However, I just cloned from mlesnick/master into a new folder and compiled again, and I'm still getting the same error. I'll look into this and try to figure out what's going on. My system is running Ubuntu 18.04, cmake 3.10.2, g++ 7.3.0-16ubuntu3, Boost 1.65.1.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393699441, or mute the thread <

https://github.com/notifications/unsubscribe-auth/ABK13EaOVuZC31hfFRltsQyTFEKuglnEks5t4GyqgaJpZM4T6CyL

.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393709546, or mute the thread < https://github.com/notifications/unsubscribe-auth/AL-lBCa-EujXnR3KuQEfnUQIUUhsH__Xks5t4HbAgaJpZM4T6CyL

.

— You are receiving this because your review was requested. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393713308, or mute the thread https://github.com/notifications/unsubscribe-auth/AXqxAPeisUyc_mt7mvuHfudO1K5PBGuqks5t4HvcgaJpZM4T6CyL .

xoltar commented 6 years ago

Hi Folks,

I just pushed a few fixes - Matthew's issue was indeed due to a wrong default value in the docstring. I removed the lines Simon suggested, and found a misplaced brace that was causing strange problems. However, there's still something weird going on now where the FIRep that gets passed to MultiBetti is null, which is crashing it. If that sounds familiar to anybody that's great, otherwise I'm tempted to roll back and redo the merge, since adding in the changes I asked Simon for in while I had already partially merged stuff before may have introduced complications. I'll probably redo the merge on Monday unless someone tells me otherwise between now and then.

On Thu, May 31, 2018 at 7:37 PM, Matthew Wright notifications@github.com wrote:

I am also not loading any precomputed file. I have experienced this error with points cloud, bifiltration, and FI rep input files.

The error is a std::runtime_error, but it comes with the message "Unsupported output format: R1". The error is being thrown by line 468 of console.cpp, because params.outputFormat isn't recognized. I didn't use the -f flag when I called rivet_console, and the help screen says the format defaults to "R1". Perhaps this default simply needs to be changed?

Unfortunately, if I call rivet_console using the flag -f msgpack, I get a segmentation fault when the output file is being written.

Bryn, do you have any insight on what is happening here?

Thanks, Matthew

On Thu, May 31, 2018 at 6:25 PM mlesnick notifications@github.com wrote:

And to be clear, I am not loading any precomputed file when I get the std::runtime_error.

On Thu, May 31, 2018 at 7:23 PM Michael Lesnick mlesnick@princeton.edu wrote:

I am also finding that Bryn's merge crashes, for example with 240 points. However, unlike Matthew, I am not getting a "Unsupported output format" message. Instead, I am getting an std::runtime_error crash. This is on my MacBook.

I didn't see any crashes before Bryn's merge today.

On Thu, May 31, 2018 at 7:03 PM Bryn Keller notifications@github.com wrote:

Ah, in that case I probably clobbered my change to pass the format as messagepack, so it's still passing R1 instead. Not sure why it worked for me in that case...

Unless you're getting this error when trying to load a precomputed file? In that case it would just be that we don't support boost serialization anymore.

On Thu, May 31, 2018 at 3:20 PM, Matthew Wright < notifications@github.com

wrote:

Bryn, I also thought it might be something like that. However, I just cloned from mlesnick/master into a new folder and compiled again, and I'm still getting the same error. I'll look into this and try to figure out what's going on. My system is running Ubuntu 18.04, cmake 3.10.2, g++ 7.3.0-16ubuntu3, Boost 1.65.1.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <https://github.com/rivetTDA/rivet/pull/115#issuecomment-393699441 , or mute the thread <

https://github.com/notifications/unsubscribe-auth/ ABK13EaOVuZC31hfFRltsQyTFEKuglnEks5t4GyqgaJpZM4T6CyL

.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393709546, or mute the thread < https://github.com/notifications/unsubscribe-auth/AL-lBCa- EujXnR3KuQEfnUQIUUhsH__Xks5t4HbAgaJpZM4T6CyL

.

— You are receiving this because your review was requested. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393713308, or mute the thread https://github.com/notifications/unsubscribe-auth/AXqxAPeisUyc_ mt7mvuHfudO1K5PBGuqks5t4HvcgaJpZM4T6CyL .

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-393740942, or mute the thread https://github.com/notifications/unsubscribe-auth/ABK13KiMINfTpY9M7-mOqSiW-BEI8Jcsks5t4Kj_gaJpZM4T6CyL .

mlesnick commented 6 years ago

Thanks Bryn. I've not seen any issue with a null FIRep like the one you mentioned.

xoltar commented 6 years ago

@mlesnick I have a new better merge, can I force-push it here (i.e. your master) to replace the old one? Note it still crashes on that example, but your 7c28e71 commit also crashes on that file for me so I don't think it's related to merging from master.

mlesnick commented 6 years ago

@xoltar, sounds great. thanks. Regarding a crash, you are referring to the 240 points example, yes? Please note that my commit "of record" is ae6b6f2, not 7c28e71. I forgot to stage something in 7c28e71, and IIRC, that something is important. Do you see a crash with ae6b6bf2?

xoltar commented 6 years ago

Hi Mike, if I build a new branch based on 7c28e71 + your ae6b6f2 cherry-picked so as not to include either of the existing commits from where I tried to merge rivet-tda's master in, RIVET still crashes on circle 240 H0 with no coarsening. So I think this problem is not introduced by merging in rivet-tda's rmaster and we should proceed with that. Do you agree?

mlesnick commented 6 years ago

@xoltar thanks for explaining. I'm a bit confused about some of what you wrote: I don't believe that commit ae6b6f2 includes any recent commits from where you tried to merge rivet-tda's master in. I don't think you should have to do any cherry-picking at all to build ae6b6f2.

Also, I'm not sure whether 240 points H0 with no coarsening is a good test case here. My earlier comment here might have given the impression that I was testing with that, but I was not. Rather, I was testing with some heavily coarsened version of 240 points, something like 10x10 or 30x30. (I don't remember whether it was H0 or H1, or both, that was causing a crash.)

I just ran some tests on my laptop (16 GB ram) with commit ae6b6f2. 240 points H0 w/ 500x500 coarsening works fine, but with no coarsening at all on the same example, I run out of memory quickly and start using a lot of swap. I think the line arrangement is just too large. Perhaps you are seeing a crash for the same reason? 240 points H1 w/ no coarsening did complete on my computer, using commit ae6b6f2.

If, aside from the issue with 240 points H0, no coarsening, the merge seems to be working well, then I'd say, sure, go ahead with the force-push. Thanks a lot for working on this!

xoltar commented 6 years ago

@mlesnick @SimonSegert I found something unexpected. When I compare the circle 240 sample file in master vs in this branch (or in Mike's latest before merging master into this PR, now known as 3f78b3e due to history rewriting), I get something very different. I'm not sure if this is because the data file has changed and I shouldn't expect the results to look the same, or if there's something else going on. Both are H0, 10x10 coarsening. Any ideas?

From master: circle_data_240pts_inv_density txt h0_10_10 line_offset_-0 199553_angle_3 814

From this PR: circle_data_240pts_codensity txt h0_10_10 line_offset_-0 746484_angle_14 4082

mlesnick commented 6 years ago

I think this is because in my fork I changed the distance cutoff to something larger than the max---with 10x10 coarsening, you miss a lot of detail with the larger cutoff.

On Tue, Jun 19, 2018 at 1:01 AM Bryn Keller notifications@github.com wrote:

@mlesnick https://github.com/mlesnick @SimonSegert https://github.com/SimonSegert I found something unexpected. When I compare the circle 240 sample file in master vs in this branch (or in Mike's latest before merging master into this PR, now known as 3f78b3e https://github.com/rivetTDA/rivet/commit/3f78b3e01fe329295ee5918b4d7c8cc66c4f9cfa due to history rewriting), I get something very different. I'm not sure if this is because the data file has changed and I shouldn't expect the results to look the same, or if there's something else going on. Both are H0, 10x10 coarsening. Any ideas?

From master: [image: circle_data_240pts_inv_density txt h0_10_10 lineoffset-0 199553_angle_3 814] https://user-images.githubusercontent.com/1226204/41577634-d900339e-7342-11e8-8951-fdb721077d11.png

From this PR: [image: circle_data_240pts_codensity txt h0_10_10 lineoffset-0 746484_angle_14 4082] https://user-images.githubusercontent.com/1226204/41577641-e6658bd8-7342-11e8-9385-e0a89b6d5df0.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-398274644, or mute the thread https://github.com/notifications/unsubscribe-auth/AL-lBKWuZZIQvu9ze6uEh9NKeGLHaoroks5t-IWJgaJpZM4T6CyL .

mlwright84 commented 6 years ago

I just pushed two commits that modify VisualizationWindow. Most notably, I moved two text fields (that display the homology dimension and input file name) to below the control elements. This reduces the minimum width of the window, to allow it to fit on screens as narrow as 1000 pixels or so. (Previously, its minimum width was greater than 1280 pixels, the width of a screen I was testing it on.) I also removed the status bar, since we weren't really using it.

mlwright84 commented 6 years ago

This pull request looks good to me. I've tested RIVET on several data sets, and it works well. I like the improvements to the interface. Also, rivet_console is much faster than before -- e.g. I did a H_0 calculation on 3600 points in 90 seconds, rather than a few hours in the old RIVET.

mlesnick commented 6 years ago

@xoltar @mlwright84 thanks for testing and for the input. I will take care of the changes Bryn requested.

@SimonSegert is now finishing a fix to issue #120 that I would like to incorporate before merging.

Also, I discovered a bug with the visualization in that arises in the current branch when "fit to window" is deselected. This occurs on my MacBook, but neither on Simon's machine nor on my Linux machine. I'm sure we could track this bug down, but I wonder if it would be better to just remove "fit to window" altogether; this feature has never really been of any use to me.

If we do this, I suppose it would suffice to move "show barcode" to halfway between where it is now and where "fit to window" is. Thoughts?

After we resolve these issues, we should be ready to merge.

mlwright84 commented 6 years ago

I'm sorry to hear about the bug involving "fit to window." I think the "fit to window" checkbox is worth preserving in RIVET -- my students and I have occasionally selected and deselected it in our use of RIVET. I think it's OK to merge the pull request, note this "fit to window" issue as a bug, and then fix it later.

mlesnick commented 6 years ago

Ok--let's keep the fit to window feature then. And I agree, the fit to window bug need not hold up the merge.

mlesnick commented 6 years ago

I did some cosmetic work on the code to address the style issues Bryn mentioned. There still exist some short variable names in the code, though I changed many of these. I suggest we take care of the others later. Most notably, some short member names that Bryn would not like are used in the structs defined in bifiltration_data.h. But I have some changes in progress to bifiltration_data.h to make the code more memory efficient. We can clean up the variable names as part of those changes.

The only two issues that remain on the table are the minor bugs in the visualization mentioned above, (apparently) related to Simon's work. But it doesn't seem to make sense to hold up the PR for these.

xoltar commented 6 years ago

Agreed, please feel free to merge when you're ready!

On Fri, Jun 22, 2018 at 2:30 PM mlesnick notifications@github.com wrote:

I did some cosmetic work on the code to address the style issues Bryn mentioned. There still exist some short variable names in the code, though I changed many of these. I suggest we take care of the others later. Most notably, some short member names that Bryn would not like are used in the structs defined in bifiltration_data.h. But I have some changes in progress to bifiltration_data.h to make the code more memory efficient. We can clean up the variable names as part of those changes.

The only two issues that remain on the table are the minor bugs in the visualization mentioned above, (apparently) related to Simon's work. But it doesn't seem to make sense to hold up the PR for these.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rivetTDA/rivet/pull/115#issuecomment-399587920, or mute the thread https://github.com/notifications/unsubscribe-auth/ABK13EZ6Hamz7aZggn8h0Edt2XMfEEvQks5t_WH9gaJpZM4T6CyL .