Closed shaypal5 closed 5 years ago
Fantastic!
Thank you so much for the PR and addressing so many issues, this is really great! I'll definitely set aside some time in the near future and review this in detail, but from initial scan, yes, it seems pretty safe and good improvements. :100:
So far as AzurePipelines, I've used it to publish the last few releases, and I'm slightly perplexed as to why it has decided to fall over on itself. I've used Travis before on this project and many others, but wanted to test on all three OS, and also, I don't really approve of what happened in Travis recently on a personal level.
Anyhow, I'll poke around AzurePipelines a bit and see what the deal is, otherwise I'll probably merge this in and go with CircleCI and then release a new version.
Hey Miles,
Thanks for the enthusiastic response. :)
I see you made a quick review. I'll go back and fix things before submitting this for PR again.
And tests are working again, so I'll take a look at the results once they're done.
Great!
Also, I see that the more tricky failure is processing with Rust. If you are interested in attempting to add the calculations there as well, the relevant area is around https://github.com/milesgranger/gap_statistic/blob/ab432b7b55d138f7c109ae2084cac19f61734b51/src/gap_statistic/mod.rs#L90 Otherwise, for now, I would be ok with raising an error if the backend is Rust, and before the next release, I can add to the rust side of things; or maybe even pushing a commit to this PR if I have time tomorrow.
Hi Shay, I put together a patch for the rust stuff that you can apply to your PR.
After addressing the black formatting test failure and the comments, we should be good to go! :+1:
I applied the patch and fixed everything mentioned in your code review, but the rust crate test is still failing everywhere. :(
That is completely my fault. After fixing the rist side I only checked that the Python tests passed. Leaving now for vacation but will have a patch to you this evening. Sorry again.
But just from looking, it seems you can just remove line 66 from src/tests/mod.rs and it should pass
Because I forgot to implement debug for the GapCalc struct. That's all.
Trying this just now!
It now seems to fail on Black being angry about optimalK.py's formatting. I've ran it through Black and committed again.
Passed!🎉
Most excellent! Can you split this into 3-4 commits based on the points this PR was addressing, and I'll merge it in later today hopefully.
How do I do this splitting? :)
4 commits or 4 PRs?
4 commits. If you have troubles doing that, I don't have any strong feelings against squashing and merging as one commit, just a preference to split the commits per issue.
Put it this way, if you don't do it by this evening I'll squash and merge it. 😉👍 But in general, don't know how comfortable you are with git. One would typically do a soft reset on master and then add the chunks as they pertain to given issues. 👍
Sorry, I was off the grid for a couple of days! :)
I don't know how to perform the procedure you described, but I would've learned. Nevermind. :)
Thank you for all the support with this PR!
No worries, I'm on vacation and didn't want to keep you waiting. Had some free time last night so all taken care of! Thanks again. 👌
This pull request is meant to address four issues (three of which I've opened, just for order's sake):
gap_df
attribute. This is done without effecting the results of the calculation.To this end, this pull request changes exactly three files:
.gitignore
- Just adds ignoring of.swp
files (for people working with vim, like myself) and of.DS_Store
files (for people working on mac, like myself).optimalK.py
- Adds:OptimalK._calculate_gap()
method.gap_df
attribute - and additional calculations which cannot be done in a k-specific function - in theOptimalK.__call__()
method.OptimalK.plot_results()
method.README.md
- Adds documentation for the original suggested method (very basic; more of a mention), the package's API and all new features. This is partial, but a good start, I believe.Examples images are below.
An important note is that while this works great on my machine, I was unable to test it as it seems the azure-pipelines-based CI solutions you recently switched to is not working. I think this should be addressed, but you can also just pull this - this code is really safe, doing nothing not already done by the package.
If I can somehow help with getting testing to work again I'd love to land a hand. I would suggest adding travis back as an additional platform to run tests, regardless of what happens with Azure Pipelines; it's free, and you don't have to put the badge, but it can help with development in the meantime.
An example of the enhanced dataframe can be seen here:
And of the new plots here: