Merging #84 into master will increase coverage by 2.63%. The diff coverage is 94.55%.
Coverage Diff:

|  | master | #84 | +/- |
|---|---|---|---|
| Coverage | 89.41% | 92.04% | +2.63% |
| Files | 66 | 67 | +1 |
| Lines | 1993 | 2099 | +106 |
| Hits | 1782 | 1932 | +150 |
| Misses | 211 | 167 | -44 |
| Impacted Files | Coverage Δ | |
|---|---|---|
| rampwf/utils/__init__.py | 100% <100%> (ø) | :arrow_up: |
| rampwf/utils/combine.py | 100% <100%> (ø) | |
| rampwf/prediction_types/base.py | 86.2% <100%> (+40.05%) | :arrow_up: |
| rampwf/kits/ramp_kit.py | 95.23% <100%> (ø) | :arrow_up: |
| rampwf/utils/tests/test_testing.py | 100% <100%> (ø) | :arrow_up: |
| rampwf/utils/command_line.py | 46.93% <25%> (-4.29%) | :arrow_down: |
| rampwf/utils/testing.py | 98.95% <98.3%> (+7.21%) | :arrow_up: |
| rampwf/prediction_types/combined.py | 97.29% <0%> (+21.62%) | :arrow_up: |
| ... and 3 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data.
Powered by Codecov. Last update 0d29f33...4557665.
@kegl can you help me understand the problem you're solving here?
Increasing coverage :)
You have my full support. But why is this increasing it?
The default is to save predictions with np.savetxt. If that doesn't work (e.g. mars craters), we catch the error and tell the users that they should implement a save_y_pred method in problem.py. This also allows them to follow the format they want; e.g., in kaggle_seguro we need to re-read X_test to get the ids written into the submission y_test_pred.
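A rough sketch of what such a kit-level save_y_pred could look like; the function name comes from this thread, but the signature, file names and column names below are illustrative assumptions, not the actual kaggle_seguro code:

```python
# Hypothetical sketch of a custom save_y_pred in problem.py. The signature,
# paths and column names are assumptions for illustration, not the actual
# ramp-workflow interface.
import os
import pandas as pd


def save_y_pred(y_pred, data_path='.', output_path='.', suffix='test'):
    # Re-read the test set to recover the ids that the submission file
    # must contain (the kaggle_seguro case mentioned above).
    X_test = pd.read_csv(os.path.join(data_path, 'data', 'test.csv'))
    submission = pd.DataFrame({'id': X_test['id'], 'target': y_pred.ravel()})
    submission.to_csv(
        os.path.join(output_path, 'y_{}_pred.csv'.format(suffix)), index=False)
```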
The higher-level reason for all of this is to continue porting functionality from ramp-board into ramp-workflow so the starting kits can be tested properly. E.g., BasePrediction.combine was not covered before, which caused ramp-board to crash even when the starting kit passed. But this is only a side effect; the goal is to give these functionalities to the user. The last such thing will be model combination, which we will be able to implement now that we can save the predictions.
One more thing that is getting easier now is writing a script that optimizes or reports on CV parameters (number of folds and train/test proportions).
@jorisvandenbossche @aboucaud I'll need help here. Both travis https://travis-ci.org/paris-saclay-cds/ramp-workflow/jobs/294574847 and local pytest crash when they don't find the file mars_craters/data/sample_submission.csv. It's weird that it is looking for it: it's a file that save_y_pred in kaggle_seguro/problem.py requires, but what does this have to do with mars craters?
It's as if on this line https://github.com/paris-saclay-cds/ramp-workflow/blob/cv_bagging/rampwf/utils/testing.py#L117, when it doesn't find save_y_pred in mars_craters/problem.py, instead of raising AttributeError it falls back to the previous call's kaggle_seguro/problem.py and crashes there (when it can't find data/sample_submission.csv). Weird.
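To make the fallback being discussed concrete, here is a hedged sketch of that dispatch; the real code is in rampwf/utils/testing.py, and the function name, argument names and exact structure below are assumptions:

```python
# Hedged sketch of the save/fallback logic being discussed; the real code in
# rampwf/utils/testing.py may be structured differently.
import numpy as np


def save_predictions(problem, y_pred, output_f_name):
    try:
        # Use the kit's own save_y_pred when problem.py defines one. If it
        # does not, the attribute lookup should raise AttributeError -- the
        # error that is unexpectedly not raised in the bug described above.
        problem.save_y_pred(y_pred, output_f_name)
    except AttributeError:
        try:
            # Default: dump the predictions as plain text.
            np.savetxt(output_f_name, y_pred)
        except Exception:
            # np.savetxt cannot handle some prediction types (e.g. mars
            # craters); in that case the user is told to implement
            # save_y_pred in problem.py.
            print('np.savetxt could not save the predictions; please '
                  'implement save_y_pred in problem.py')
```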
Indeed, if I remove kaggle_seguro from the top of the list here, mars craters (and the whole test) runs with no problem: https://github.com/paris-saclay-cds/ramp-workflow/blob/cv_bagging/rampwf/kits/ramp_kit.py#L11
From what I could understand, travis is downloading the mars craters data before the testing of kaggle_seguro is complete. Therefore the data_path variable set to '.' refers to the mars_craters directory and not the kaggle_seguro one, hence the trouble we're having.
We need to add some coordination.
It also happens locally when running pytest, but I guess it's the same problem. I could work around it by catching Exception instead of AttributeError, but that's somewhat unsatisfying. If what you're saying is true, we may have other sync problems. The good news is that this is strictly a testing issue, since nowhere else do we loop through different starting kits.
You have no async code. How can this happen?
OK, I found the problem; it was the same one that made imp-ed modules unpicklable: we forgot to name the modules.
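For reference, a minimal sketch (not the actual fix in this PR) of how naming each dynamically loaded problem.py avoids this kind of cross-kit leakage; the helper name and path layout below are assumptions:

```python
# Minimal sketch, not the actual ramp-workflow fix: give every kit's
# problem.py its own module name when loading it with imp.
import imp
import os


def load_problem(kit_path):
    kit_name = os.path.basename(os.path.normpath(kit_path))
    # A distinct name per kit keeps the loaded modules separate in
    # sys.modules; forgetting to name them (as described above) is what made
    # the imp-ed modules unpicklable and let attribute lookups resolve
    # against the wrong kit's problem.py.
    return imp.load_source('problem_' + kit_name,
                           os.path.join(kit_path, 'problem.py'))
```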
I think I'm finished. Please @aboucaud @glemaitre @jorisvandenbossche review the code. Also add comments where you think I should comment more, in case some of the additions are unclear. I wouldn't be against some modularization; assert_submission is pretty long now.
@kegl I like the idea of coloring the output of assert_test_submission to make the output easier to scan, but I think we need a solution that is also compatible with Windows (colored mentions it is only for linux/unix, and @glemaitre also quickly tested it on Windows and it does not seem to work).
There are terminal coloring libraries that can do that, I think, but I would personally leave that for a separate PR.
Small miscommunication with @glemaitre: it does not work on Windows, but it also does not output any garbage (it just ignores the color codes). So it might be fine to keep, although it would still be nice to have something that works cross-platform.
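One cross-platform option (an illustration only, not something used in this PR) would be colorama, which translates ANSI escape codes into Win32 calls on Windows; the printed strings below are dummy examples:

```python
# Illustration only, not part of this PR: colorama makes ANSI color codes
# work on Windows as well as on linux/unix terminals.
from colorama import Fore, Style, init

init()  # enables ANSI handling on Windows, no-op elsewhere
print(Fore.YELLOW + 'Testing submission starting_kit ...' + Style.RESET_ALL)
print(Fore.RED + Style.BRIGHT + 'official test score: 0.123' + Style.RESET_ALL)
```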
What's the meaning of the colors?
I agree that we need to cut up assert_submission; I'll work on this.
I made the warning disappear; it's a bit hacky, see the next commit.
Yes, I went overboard with colors. I like that the yellow stands out and shows me where the run started when I scroll back. Then green/blue/red is train/valid/test, and bright means the official score. My main purpose in the beginning was to make the official test scores stand out (bright red). Let's keep this open for now and see how it feels after getting used to it.
Not much code, but it will need careful review. Make it more Pythonic.