In the absolute worst case, set up a Linux virtual machine with R and everything else in it.
@zmjones can you provide a traceback()?
Get on a hangout later today and we will show you how to find the error; this cannot be hard. But we have the useR tutorial now :)
@zmjones As @mllg mentioned: call traceback() directly after the error occurs. I think it might be something trivial, like an issue in your namespace. And you are sure that you haven't used your own mlr fork :wink:?
Turns out it was my .RProfile. Specifically, one of these:
options(parallelMap.default.cpus = 8,
        parallelMap.default.mode = "multicore",
        parallelMap.default.show.info = FALSE)
@mllg and @jakob-r that is how I eventually remembered this. The traceback took me to doResampleIteration, I figured it was some problem with parallelMap, and then it occurred to me that I have an .RProfile, which sometimes ruins my life!
@larskotthoff I actually have an Ubuntu VM that I tested this in late last night as well. Definitely a good sanity check.
Glad to hear that it's working now!
By the way, thank you very much for the tutorial section on partial dependency plots!!! Looks great.
@schiffner Yep. Let me know if you want me to make any changes. I am going to work on the tutorial some more today.
Thanks. I just read the partial dependence section again. I really like it and have nothing to add.
I already did some proofreading this morning and changed only minor stuff. I also added the new section to the pages configuration in mkdocs.yml.
If you want to add a new section today please also put it into mkdocs.yml. That wasn't made very clear in the mlr-tutorial README. I reworked it a bit this morning to clarify what one needs to do to edit a section, add a section, what is done by Travis and so on. Please let me know if anything is missing.
What are your plans today for the tutorial, and in general? I found a reference to a section plot.md?
Yea I missed that and was going to ask about it. Thanks for the clarification.
So Bernd asked me late last week to add a short plot page which referenced all the plots and talked briefly about the usage pattern. I think he might have wanted me to split out plotvsthresh, so I was going to look to see if that made sense. What do you think?
I think the plot page is a good idea.
About the extra section on plotThreshVsPerf: I'm not sure. What are your/Bernd's plans for this section?
I will open up an issue in the mlr-tutorial repo regarding restructuring.
Oh ok, didn't know that. I will wait on it then. Will check out the new issue. Thanks!
Ok, I'm on it. When you are working on the tutorial today, just put everything in Advanced.
Ok will do.
What's our general opinion on where the visualization stuff should go? I tend towards including relevant plots on pages that describe the concepts rather than having a separate visualization page. Does anything not fit this model?
I don't disagree with that, and made the same point to Bernd. He thought there should still be a short separate page. I was thinking it would be more like an index than a full page. This all depends on @schiffner's reorganization plans though.
I don't know if there is a general opinion. I opened an issue about the tutorial structure here: https://github.com/mlr-org/mlr-tutorial/issues/6
I guess it depends on how complex the plots are. For example: we have plotLearnerPrediction on the Predict page, which is fine with me, but I wouldn't want to have partial dependence plots on this page, because it's more complicated.
I'm not opposed to redundancy, i.e. mentioning plots in two places, if it helps the reader.
Concerning the page about the plots: I think it can't hurt to make the generate data / plot data usage pattern clear and show 1-2 examples.
My opinion is this, and this is what I discussed with Zach:
There are a couple of different kinds of plots, and ways to talk about them in the tutorial.
a) Plots that are pretty general and stand on their own. They get their own tutorial page, like partial dependence plots.
b) Either smaller stuff, not-so-important plots, or plots that really make sense in connection with other mlr API stuff, like plotThreshVsPerf.
This is also our current structure and I don't want to (or better: Julia should not have to) change everything.
What I told Zach to do, so all plots can be found most easily, is to build one extra page that simply names and references all the plots in the tutorial.
On the plotThreshVsPerf stuff: if Julia is already working on this and has some unpushed stuff, Zach should wait, but could later iterate over it briefly?
@berndbischl:
Thx for the clarification.
Thanks to @zmjones there is now a section about mlr's plots in the tutorial: http://mlr-org.github.io/mlr-tutorial/devel/html/visualization/index.html
About the threshold/plotThreshVsPerf section: are you ok with the contents I listed above? What were your plans? Will try to finish/push this soon.
I have been working on the generic permutation importance today.
I just made this another filter.
You would pass a learner, a task, a measure (only one I think would be best), a contrast function, an aggregation function, and the number of permutations to conduct. Do you all think that is too much control or too little?
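To make the proposed interface concrete, a rough sketch of what such a call might look like (the function name and exact argument names here are purely illustrative, not an actual mlr API):

library(mlr)

lrn = makeLearner("classif.rpart")
imp = generatePermutationImportance(   # hypothetical name for the proposed filter/generator
  learner = lrn,
  task = iris.task,
  measure = mmce,                      # a single performance measure
  contrast = function(perm, unperm) perm - unperm,  # compare permuted vs. unpermuted performance
  aggregation = mean,                  # aggregate the contrasts over the permutations
  nperm = 10                           # number of permutations per feature
)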
Related to this is local feature importance, which I didn't really think fit here. This is just the contrast between permuted/unpermuted predictions by observation. For some EDA work I have done I've used a smoother to get an idea of where in the distribution of the target feature the feature was most important. I thought that was a somewhat neat insight and could be generic as well. Maybe generateLocalImportance could be a thing I do later.
@larskotthoff sorry for lagging behind so much on testing of the plots. :/
@zmjones Sounds good.
I talked to Hadley the other day on Twitter, and I think that saving to SVG and then checking the XML is the only feasible idea for testing that ggplot objects draw correctly. I wonder if this shouldn't be a separate package though. I don't know enough about the SVG spec to do it yet, but I am reading up on it now.
I think it'll be fairly simple. The main point of the test would be to verify the presence and attributes of some elements, e.g. for 10 data points there should be 10 circles. You could approach this purely from an XML point of view (ignoring that it is also SVG), as the point is not to validate the produced SVG. I'm not really familiar with XML processing in R, but finding all elements of a certain type and counting them shouldn't be more than 1-2 lines of code.
Do you have something in mind you want to test that would require more complex operations?
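For instance, counting elements of a given type needs only a couple of lines with the XML package (a sketch; the file name and element type are placeholders, and the actual structure of the generated SVG is worked out further down the thread):

library(XML)
doc = xmlParse("plot.svg")
# count all circle elements, regardless of namespace
length(getNodeSet(doc, "//*[local-name()='circle']"))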
Yea that would be simple I agree. I was thinking more like verifying the positions of things. But now that I think about it the simple version would probably be enough to catch almost all of the things that might go wrong.
I don't think we should even try to verify the position of things, at least not in absolute terms. They depend on the size of the canvas and a test on a small canvas may "look" fine, but break for larger sizes. Also, this is highly subjective. I would only attempt to verify correctness in the sense that all data that should be represented in the plot is represented and similar high-level things.
Ok, that part of SVG I understand well enough to start on.
Hi Zach,
I think you are mixing up assert with stopifnot. Here is an example from your code:
checkmate::assert(interaction %in% obj$features)
This does not throw an informative error message. The assert function is used to combine multiple checkSomething() functions, see the (apparently imperfect) documentation.
Besides that: nice code! Keep up the good work!
Michel
Yea you are right that I confused that. I take it I should just use stopifnot in that case then?
No, we don't use stopifnot; we have assertChoice. That is what you want to do here, right? If an assert* really does not exist, use an if plus a readable error message. But you really should not have to do this often.
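A small sketch of the difference (using checkmate; the exact error text shown is approximate):

library(checkmate)

# assert() combines several check*() calls; it passes if any one of them succeeds:
x = "prob"
assert(checkChoice(x, c("response", "prob")), checkNull(x))

# For a single membership test, assertChoice() fails with an informative message:
assertChoice("se", c("response", "prob"))
# Error (roughly): Must be element of set {'response','prob'}, but is 'se'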
I am in the process of fixing all of this, but I am not sure of the best way to check this one. In generateThreshVsPerfData.BenchmarkResult I want to check that the elements of getBMRPredictions all have predict.type = "prob".
Something like the following?
assert(all(sapply(extractSubList(obj, "predict.type"), function(x) assertChoice(x, "prob"))))
or assertSetEqual(extractSubList(obj, "predict.type"), rep("prob", length(obj)))
cc @mllg
I am a bit lost on the SVG testing. I read through the spec, but the SVG files generated by grDevices::svg (via ggsave) don't seem to fit with it. In particular there are lots of references to particular glyphs for which I cannot seem to find a definition. I'm also having trouble distinguishing (in the XML) which are plot elements we want to test and which are background elements (e.g. the plot grid).
Don't let this slow down your progress too much. If this is hard to do, test what you can with normal testthat (e.g. that the code at least runs). For the rest, look at the plots manually. I guess you have to do that anyway to a certain extent.
Ok. I still agree with @larskotthoff that this would be nice to have, as there are instances where the plot draws but is not correct. Right now I test the generation functions well enough, but for the plot functions I only test that they produce a plot that does render.
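For reference, the kind of minimal check currently in place looks roughly like this (a sketch with testthat, not the actual test code):

library(testthat)
library(ggplot2)
test_that("plot draws without error", {
  p = ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point()
  expect_error(print(p), NA)   # regexp = NA means: expect no error when rendering
})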
I 100% agree that this would be very nice to have....
Hmm, I'll have a look when I get a chance and try to come up with something.
Ok, here's what I'm thinking of:
library(ggvis)
library(XML)
p = mtcars %>% ggvis(~wt, ~mpg) %>% layer_points()
export_svg(p, "/tmp/test.svg")
doc = xmlParse("/tmp/test.svg")
print(nrow(mtcars))
print(length(getNodeSet(doc, "//svg:g[@class='type-symbol']/svg:path", "svg")))
This is checking that there's a symbol (which are unfortunately generic path rather than circle elements) for each row in the original data frame. "//svg:g[@class='type-symbol']/svg:path" is an XPath expression that selects all path elements directly underneath a g element that has a class of type-symbol (which is how Vega designates the data layer). Figuring out the specific structure to check for each type of plot may be a bit of work at first, but should be straightforward by just looking at the generated SVG in a text editor.
Ok, that is what I was trying to do. I am still not following how to do this. I haven't even tried it with ggvis yet, just ggplot2. I didn't see any obvious correspondence between the XML and the plot (in the ggplot2 SVG file), though obviously there is one.
Can you point me to where in the Vega docs this is written?
Right, the ggplot2 SVG looks much uglier. Looks like there's probably a correspondence between the glyphs and the elements to plot.
I didn't even have a look at the Vega docs (and I don't think that this is documented). Just have a look at the generated SVG -- if you load it in a browser and then right click -> "Inspect element" it will show you what part of the source corresponds to each element.
Ah duh. I should have known to do that. I am not sure of the utility of testing a ggvis plot like that though, as most are embedded in Shiny apps. I looked at RSelenium which you pointed out to me, but (another apparent duh on my part) we'd need two R processes to do any testing of the app, since the R process running the app is locked. I can do that locally of course but don't know if that is possible on Travis, or how to trigger that automatically from within R whilst running the tests.
I think if I could find out what the glyphs refer to in the ggplot2 SVG, the XML solution might work.
One thing I haven't done though is to test the data element of the plot object (for ggplot2). I can't imagine a situation in which the plot prints without errors/warnings but an error would be detectable by looking at this, unless we happen across a bug in ggplot2 itself. Do you agree with this? What sort of bugs could the XML solution catch that wouldn't generate errors otherwise?
The aim would be to catch incorrect settings for the plot parameters, e.g. fill/line colour being set to the wrong thing. This wouldn't generate any errors or warnings, but an incorrect plot. It hopefully won't be too hard to check if there are things with n different colours in the generated plot. Another thing would be labels. Essentially anything that the user can't change themselves and that would cause the plot to be broken in some way if it's missing.
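For example, a label check might look roughly like this (a sketch assuming an SVG device that keeps text as text elements, e.g. svglite; the cairo-based grDevices::svg turns text into glyph references, which is exactly the complication discussed above, and the expected label here is hypothetical):

library(XML)
doc = xmlParse("/tmp/plot.svg")
# collect all text elements, regardless of namespace, and check for the expected axis label
labels = unlist(xpathApply(doc, "//*[local-name()='text']", xmlValue))
"Sepal.Length" %in% labels   # hypothetical expected x-axis label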
For the actual testing I would get the plot part out of the shiny app and check that (if that's possible without too much hassle).
Ok, makes sense. So I can't seem to write SVG to a file from the running app yet, but I think it is possible. I guess if I can do that then we can skip the whole RSelenium thing and just test that each set of inputs gives the output we want.
Yes, I think that would be preferable. We can talk on hangout about this later today or tomorrow.
Ok cool. I am feeling pretty terrible now so how about tomorrow?
Sure, I'm pretty much free tomorrow. Just let me know on hangouts.
Hey @larskotthoff I can do a hangout anytime now.
I have tests done now for plotPartialPrediction. Is it ok if I put XML in Suggests but not explicitly load it in the base context and instead reference it with "::"? Or should I load it in base and not use "::"?
1) First of all, always use :: if the package is not in Depends or Imports. Even then you might consider using ::.
2) We really don't want to depend on XML.
3) We cannot avoid XML in Suggests, right? Otherwise we don't pass R CMD check, or?
In the near future (before May 5) I plan on refactoring plotROCRCurves to use plotROC, which uses ggplot2 instead of base graphics. This package offers some extra functionality (compared to what is available now) which I'll document. I also hope to get at least one other smallish feature done by then.

One option would be extending plotLearnerPrediction to cases with 3 features. I think the two obvious things to do here are to use one of the 3D plotting packages (I think plot3Drgl is nice) and to use facetting for the third feature; the latter is something I'd definitely like to do. With a discrete feature this is easy, but it might be nice to add the ability to discretize one of the features as well (see the sketch after this comment). We could also plot 4 features by using the background color as well. In general it would be possible to layer on additional features in this way, but it seems to have diminishing returns in terms of interpretability after 2 or 3 features.

Another thing I could possibly do is to add an option to performance that lets you apply a measure to a subset of the feature space. I find this very useful for exploring the fit of learners, especially with data that is structured in some way. I haven't looked at the code for performance yet, so I don't have an idea how much work that would entail. One problem I can see is that if some of the cells of the grouping are small, the variance might be quite large. I am not sure whether that is out of the scope of the project. Is this something others would like to have?

When I get back (around May 16-17) I would like to finish up any residual work from the above first. I'd like to talk to Julia/Lars/Bernd about what I do next. I've had my nose in the EDA-related functionality lately, and so my inclination is to start working on that first. Alternatively I could start work on producing interactive versions of the existing plotting functionality.
I have found some papers recently that I think are worth prioritizing above the minor things in my proposal (dependent data resampling methods and standard error estimates for random forests and other ensemble methods). In particular Hooker 2012 and Drummond and Holte 2006.