Closed ryanhammonds closed 4 years ago
These kind of comments should probably go into an issue, for any discussion, and then open PRs to respond (cus this isn't really the right place for comments).
what are the pros / cons to 1 vs 2 DFs. I think at some point someone (Erik?) requested the idea of two, which might have benefits?
what does "add self imports to all docstrings so they can be copy/pasted to run" mean? The doctests do run right, considering that we test this with doctest?
for plotting, we want to be careful adding plt.show
everywhere, I think. It can make for weird interactions when the user tries to use it outside the function, and also if functions get called repeatedly (for example, when we build plots with multiple plot calls). Some modules have a show=True/False
and then call plt.show accordingly, which is a possible option. In the end it comes down to what modes we support, and we don't want to make other modes worse to use.
@TomDonoghue I'll move these points to an issue and reference this PR.
what are the pros / cons to 1 vs 2 DFs. I think at some point someone (Erik?) requested the idea of two, which might have benefits?
It makes thing a little simpler I suppose. The dataframe isn't too large. I wish there was some kind of sub-dataframe we could have with pandas that is shown hidden but can still be accesses.
what does "add self imports to all docstrings so they can be copy/pasted to run" mean? The doctests do run right, considering that we test this with doctest?
If you copy and paste the docstring examples they won't run because the function one copies from isn't imported.
for plotting, we want to be careful adding plt.show everywhere, I think.
Yeah, plt.show will cause the plot to show twice in a notebook. A kwarg may be helpful for those you perfer ipython.
General discussion moved to #76.
@TomDonoghue I did a quick self-review and fixed a few typos. I also added a util helper func to separate df_samples from df_features. This PR should be ready to merge but I'll hold off for now.
After discussing with @nschawor, I have a list of updates to make before releasing 1.0.0. This PR so far addresses returning a single df instead of two.
return_samples
kwarg now adds/removes columns, rather than returning to separate dataframes.