Closed lewtun closed 3 years ago
Check out this pull request on
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Wow, that was unexpected! 👍 🥇
It's great, thanks! :heart_eyes:
I believe that data.generate_dataset.generate_point_clouds
(now renamed to make_point_clouds
) is also used in vietoris_rips_quickstart.ipynb
. I am looking at the docs and will suggest some changes anyway.
Meanwhile, I would like to draw our attention to the fact that it has not been picked up by the CI.
EDIT: ah, ok - the notebook tests are disabled, so this was expected - my bad!
Meanwhile, I would like to draw our attention to the fact that it has not been picked up by the CI.
EDIT: ah, ok - the notebook tests are disabled, so this was expected - my bad!
You are right, but it's good that you are reminding me to trigger notebooks before merging this ;)
It's greatm thanks! 😍
I believe that
data.generate_dataset.generate_point_clouds
(now renamed tomake_point_clouds
) is also used invietoris_rips_quickstart.ipynb
. I am looking at the docs and will suggest some changes anyway.Meanwhile, I would like to draw our attention to the fact that it has not been picked up by the CI.
EDIT: ah, ok - the notebook tests are disabled, so this was expected - my bad!
Good catch! I did not realise that the quickstart overlaps so strongly with the shape classification example - would it make sense to delete the former in favour of minimising redundancy?
Would it make sense to delete the former in favour of minimising redundancy?
I would not be in favour of this at the moment. The shape classification tutorial assumes a lot from the reader, mathematically speaking. The quickstart wants to be just that, without any frills, just to establish some basics.
Of course, in an ideal world where Lewises abund, the shape classification tutorial would also be improved and then maybe things could be different.
I would not be in favour of this at the moment. The shape classification tutorial assumes a lot from the reader, mathematically speaking. The quickstart wants to be just that, without any frills, just to establish some basics.
OK, sounds good to me.
Of course, in an ideal world where Lewises abund, the shape classification tutorial would also be improved and then maybe things could be different.
Do you mean that I would pick a different set of point clouds to warm the reader up on 😃?
Oh no wait! Sorry @lewtun, everything I have been saying was based on my misunderstanding that you had changed the "2D voids in 2D" notebook, not the shape classification tutorial! So when I said that this tutorials assumes too much from the reader mathematically speaking, after your hard work, that would have come across as quite crass! Sorry!
I take back what I said above and suspend judgement until I read what you actually did here. And ask @wreise what he thinks about your merging proposal.
@lewtun , I modified:
##
level). Otherwise, we could add a##
-level section for the synthetic data and then keep the ##
level for the real dataset. What do you think?
Also, i fixed the bug in vietoris_rips_quickstart
.Otherwise, nothing to say and it's great!
thanks for the changes @wreise - both look good to me! it's a pity about the centering of the images, but i can live with that 😄
View / edit / reply to this conversation on ReviewNB
ulupo commented on 2020-10-19T08:55:30Z ----------------------------------------------------------------
I would consider
"The effect of connecting points as we increase some radius is the creation"
instead of
"The effect of connecting points as we increase some radius ϵ results in the creation"
Furthermore, I would be careful with using "geometric simplicial complex" here. Geometric simplicial complexes (as opposed to abstract simplicial complexes) are typically meant to mean actual subsets of Euclidean space in which k-simplices really live as k-dimensional submanifolds (with corners or whatever). In giotto-tda, we never really compute the PH of geometric complexes build from data because we don't yet support alpha filtrations. Vietoris-Rips etc only ever build abstract (i.e. combinatorial) simplicial complexes for data, which can't always be realised as clean triangulations of the actual point cloud for instance.
Furthermore, there seems to be a little alignment/bullet point issue in the definition of the Betti numbers.
Finally, the note about the meaning of homology should terminate with a full stop instead of a comma.
wreise commented on 2020-10-19T20:07:52Z ----------------------------------------------------------------
I agree with @ulupo about the geometric
. Also, i realized that the definition of complex says that "a complex is a set of $n$ point, so a line"... We would need to add "the convex hull", and maybe talk about the standard "k-simplex".
To avoid both questions, what if we focused on the abstract simplicial complexes only?
View / edit / reply to this conversation on ReviewNB
ulupo commented on 2020-10-19T08:55:31Z ----------------------------------------------------------------
"From the persistence diagrams [...]" appears as a section title, which would seem weird to me. Furthermore, there might be a small issue with bullet points (at least there is one in ReviewNB).
wreise commented on 2020-10-19T19:50:21Z ----------------------------------------------------------------
Having it open in Jupyter notebook, I do not see any issue with the section title nor the bullet points.
View / edit / reply to this conversation on ReviewNB
ulupo commented on 2020-10-19T08:55:32Z ----------------------------------------------------------------
The link to the persistence entropy doc entry seems mangled here.
wreise commented on 2020-10-19T19:58:33Z ----------------------------------------------------------------
Clicking from the notebook or the docs, it works like a charm - it links to the glossary. https://giotto-ai.github.io/gtda-docs/latest/theory/glossary.html#persistence-entropy
Did you experience any particular issue?
View / edit / reply to this conversation on ReviewNB
ulupo commented on 2020-10-19T08:55:33Z ----------------------------------------------------------------
Again, "A more sophisticated feature is [...]" appears as a section title on ReviewNB, could you check?
wreise commented on 2020-10-19T19:58:59Z ----------------------------------------------------------------
All good :)
View / edit / reply to this conversation on ReviewNB
ulupo commented on 2020-10-19T08:55:33Z ----------------------------------------------------------------
It really disturbs me that Python gives precedence to "+" over the star operator. To me this makes the code less readable. Would this not work?
feature_union = make_union( PersistenceEntropy(normalize=True), NumberOfPoints(n_jobs=-1)], *[Amplitude(**metric, n_jobs=-1) for metric in metrics] )
I agree with that.
@lewtun this is amazing work, excellent! The Klein bottle (and projective space) are dead, long live the Klein bottle (and projective space).
Just left a couple of pedantic comments. Having read quite how good a pedagogical job you did, I would now not be in disfavour of merging this notebook with the quickstart. Somehow it would be nice to keep some of the immediacy of the quickstart, but maybe this is hard. In any case, this can be done in a separate PR to keep things easier to follow.
Once you or @wreise address my review I will run the CI with the notebook checks on.
Having it open in Jupyter notebook, I do not see any issue with the section title nor the bullet points.
View entire conversation on ReviewNB
Having it open in Jupyter notebook, I do not see any issue with the section title nor the bullet points.
View entire conversation on ReviewNB
Clicking from the notebook or the docs, it works like a charm - it links to the glossary. https://giotto-ai.github.io/gtda-docs/latest/theory/glossary.html#persistence-entropy
Did you experience any particular issue?
View entire conversation on ReviewNB
I agree with @ulupo about the geometric
. Also, i realized that the definition of complex says that "a complex is a set of $n$ point, so a line"... We would need to add "the convex hull", and maybe talk about the standard "k-simplex".
To avoid both questions, what if we focused on the abstract simplicial complexes only?
View entire conversation on ReviewNB
hey @ulupo and @wreise, i tweaked the text to remove any reference to "geometric simplicial complex", but decided to keep the pictures and description of $k$-simplices because i personally found this useful to understand intuitively what's going on under the hood.
i think wojciech already tackled the other remarks, so please go ahead with the CI test and if everything checks out feel free to clear the outputs and merge!
@lewtun thanks! I'll look into the CI. From @wreise I'd like to know whether things would look good online in the current state. I'm particularly worried about the occasional use of HTML code.
The CI failures in macOS are due to an ongoing screw-up in the way Azure pipelines and brew interact. Hoping they fix it soon...
@ulupo , the html looks good. The only trouble is with the tip - i haven't found a reasonable fix yet.
@ulupo , the html looks good. The only trouble is with the tip - i haven't found a reasonable fix yet.
thanks for checking @wreise! maybe we can just ignore the ">" syntax i used for the tip and just resort to normal text?
thanks for checking @wreise! maybe we can just ignore the ">" syntax i used for the tip and just resort to normal text?
Yes, this works. @ulupo , imo, it's ready to be merge.
Unfortunately the brew issue is quite severe and Azure are not being too proactive at fixing it, so we will have to merge this knowing some pipelines will fail. Additionally, the CI has become too heavy for Azure it seems, we will have to make more of the checks optional from now on.
Unfortunately the brew issue is quite severe and Azure are not being too proactive at fixing it, so we will have to merge this knowing some pipelines will fail. Additionally, the CI has become too heavy for Azure it seems, we will have to make more of the checks optional from now on.
Pardon my ignorance, but is there any reason we couldn't migrate all our CI to GitHub actions? Or is it because we support mac OS / Windows / Linux that we're forced to use Azure?
Pardon my ignorance, but is there any reason we couldn't migrate all our CI to GitHub actions? Or is it because we support mac OS / Windows / Linux that we're forced to use Azure?
I don't know of any reason for using Azure especially, and I am fundamentally ignorant about modern CI practices. One thing we like to use is the manylinux2010 platform (? I don't really understand what it means) for building linux wheels. But I also don't know why we should be migrating to a new system. Perhaps this brew issue is a good reason by itself if not resolved quickly, but otherwise, why bother with the presumably non-trivial job of migrating? What are the core benefits?
Also relevant and includes support for Python 3.9: https://cibuildwheel.readthedocs.io/en/stable/
Perhaps GitHub actions would make the job of deploying docs easier? @wreise in that case there would be yet another reason to migrate.
I'd be happy to look into migrating as a team effort. On my own, I lack the time/motivation at the moment.
But I also don't know why we should be migrating to a new system.
Oh I only suggested this because of the problem with brew on Azure. But you're right, we should do some research to figure out what the alternatives are and decide whether it even makes sense to migrate. I am pretty busy right now, but would be happy to look into this in a few months time.
Reference issues/PRs
Types of changes
Description
Screenshots (if appropriate)
Any other comments?
Checklist
flake8
to check my Python changes.pytest
to check this on Python tests.