Closed shixzie closed 7 years ago
the Split() methods takes a percent (value between 0 and 1), calling Series.Split(0.5)
would split the series in two equal parts (50%) the Series on which the func was called (let's call it S
) will contain the first 50% and the returned Series S'
would have the other 50%. by calling Series.Split(0.3) S
would have the first 30% and S'
the other 70%
I want to be a contributor :)
Thanks @svanharmelen for your review and opinion :+1:
Thank you very much for you contribution @Shixzie, and thank you too @svanharmelen for your review. Before getting into the actual review, I have to say that there are way too many things happening on this PR which make it very difficult to review:
nil
checks on some instancesDataFrame.Split()
and Series.Split()
methodsFor future contributions I would ask you to keep each major change as a separate PR. Ideally you would branch dev
for every feature and then you will target that specific branch to be merged back into dev
. This is a good read: http://nvie.com/posts/a-successful-git-branching-model/
Also before performing any PR, make sure you have your code formatted with gofmt
, you have test coverage for at least the happy path, pass golint
and go vet
linters, and check that your code follows the style of the existing codebase.
Now I'm gonna put here my thoughts about each of the changes.
Although I feel that a better error trace would be interesting, as referenced in #7, I don't think that the way it was implemented here was ideal. First of all, I feel that the creation of a helper function for createErr
feels a bit out of place, being that all it does is calling errors.Wrap
. Before merging something like this, I feel it's better if we would discuss how this is to be done in the #7 thread.
nil
checks on some instancesIn some cases this might be useful, like on the DataFrame.Set
method, but I saw a number of times that you check both len(x) == 0
and x == nil
. This is not necessary as @svanharmelen pointed out, since len(x)
will already return 0 on a nil
slice.
This is great, I also have been working on this, so I guess it's my bad for not marking the issue as assigned to me, but wasn't expecting any contributions. In any case, I'm going to share what I had so far, you can put it together with your code and we can review it on a separate PR: https://gist.github.com/kniren/f961e32a811b305b5442bae303f5d8b9
Regarding the time parsing when converting from string, this is a tricky matter, but I feel we should settle for a given format for automatic conversion (I favour const timeformat string = time.RFC3339
) and encourage users to either use that or pass an existing time.Time
.
DataFrame.Split()
and Series.Split()
methodsThis is a good idea, although I'm not sure it's the right implementation for the Split method. Essentially for this use case (Splitting by percentage), you can just create a []int
index with randomized elements and then make the splitting from there. When talking about splitting on #13, the expected behaviour is to split a DataFrame to N
DataFrames by a number of factors. This is consistent with what a user from pandas
or R
would expect. In any case, I still don't know which should be the best behaviour for #13, so I encourage you to discuss the pros and cons on that issue thread before jumping to code. I do see the need and the benefit of having a percentage split on Gota
but I'm just not sure this is the API that I would like to expose, furthermore we can continue the discussion on this matter.
With all that said, I would like to thank you once again for your contribution. I'm going to close this issue, but I encourage you to continue the discussion and to submit individual PRs for the appropriate topics after considering my feedback.