parsing-science / pymc3_models

Apache License 2.0
157 stars 24 forks source link

Gaussian naive bayes #7

Closed rlouf closed 5 years ago

rlouf commented 6 years ago

I implemented the Gaussian Naive Bayes algorithm. I added tests, a notebook and updated the docs.

rlouf commented 6 years ago

Closes #4

rlouf commented 6 years ago

Thanks! What linter would you recommend? I've used pyflakes and pylint but they missed most of the things you pointed out.

rlouf commented 6 years ago

Also, not directly related to this: most python packages adopt the snake_case naming convention for files. Wouldn't it make sense to do the same here?

rlouf commented 6 years ago

I implemented most of the requested changes. I am not sure the docstrings in the rest of the package follow PEP8. I left mine as they were; I can follow the rest of the package, and we may address this in an another PR.

parsing-science commented 6 years ago

First of all, I appreciate your enthusiasm for this project. I'm glad you have a lot of ideas to contribute.

However, I would ask that you please follow the contributor guidelines (https://github.com/parsing-science/pymc3_models/blob/master/CONTRIBUTING.rst), particularly the section about following existing conventions. I made thoughtful deliberate decisions about style and the code itself so I'd appreciate if you respected those.

If you want to refactor your code to match the rest of the repo, I'll be happy to take another look at your PR.

rlouf commented 6 years ago

Sorry, I wasn't questioning your choices. Seeing the references to scikit-learn and numpy in the README and CONTRIBUTING, I started writing code with those in ming. I realised after the fact that the conventions adopted here were slightly different. Not a problem for me, I was happy to refactor to match the rest of the repo.

I ran pyflakes and pycodestyle on both the model and tests. All I got were unused variables in the declaration of the model, and lines longer than 79 characters---they are all shorter than 100 characters thought.

twiecki commented 6 years ago

This looks great! @parsing-science are there any outstanding issues you see?

parsing-science commented 6 years ago

I've been busy with some personal stuff in the last few months. I'll try to take a look in the next few weeks.

rlouf commented 6 years ago

Is there anything I can do to help? I started implementing other models, but I would like to have this PR merged before making them PR-ready.

rlouf commented 6 years ago

I'm still waiting for the tests to finish, please wait before reviewing the changes.

rlouf commented 6 years ago

All right, I made all the requested changes and the tests now pass (your fix did the trick). I also added a picture of the graphical model corresponding to Naive Bayes for the sake of being fancy. If this looks ok to you I can do a final rebase to clean this up before merging.

rlouf commented 5 years ago

I (finally) addressed all the issues you raised, rebased my branch onto master and update the date in the changelog. This should be ready to merge if you approve of the changes.

rlouf commented 5 years ago

Cool! I’ll rebase my branch against master, make the changes, clean the history a little bit and it should be ready to merge.

rlouf commented 5 years ago

I juste rebased my branch against master. I tried to clean the history as much as I could but I have to admit my commits were all over the place at the time... I did my best without resorting to heavy git surgery. Good to merge if the tests pass!

parsing-science commented 5 years ago

@rlouf: thanks for doing this! It looks like the linting is failing so I stopped the unittests so you can fix that stuff up first.

rlouf commented 5 years ago

@parsing-science Thanks! Makes sense, I was not using flake8 as a linter, good thing you enforced that with CI! Will make the necessary changes ASAP.

rlouf commented 5 years ago

Done! Waiting for Travis to do its job :)

rlouf commented 5 years ago

Travis fails with:

The job exceeded the maximum time limit for jobs, and has been terminated.

I would temporarily increase the time limit (on master), rebase my branch on master and proceed. Ultimately we would need to specify a smaller sample size for tests whenever large ones are not necessary, but I’d leave that for another PR.

parsing-science commented 5 years ago

@rlouf: Looks like Travis has a limit of 50 minutes for free projects so I can't increase it. Can you confirm that the tests pass locally for you? I agree that the sample number should be fixed in a separate PR.

parsing-science commented 5 years ago

Actually I pulled down your branch and ran some of the tests so I'm going to merge this. Thanks!

rlouf commented 5 years ago

Thanks! This was a great learning experience, thank you for bearing with me. I hope to be able to contribute more models soon.