figs update

jovo commented 8 years ago

subtitle font is too big i think
for fig 1: also show "truth", don't show "RerFdn" (we'll have a final figure showing the various modifications)
fig 2 C: why does it take so long?
fig 2 B & E: i thought RerF did better than sparse parity & trunk in these simulations?
fig 3: i thought we decided that each panel would be a transformation, to demonstrate that RerF does better than RF and RotF in each?
fig 4: is that for the real data or the sims? in each case, maybe start the y-axis at 0.5 or even higher?

tyler-tomita commented 8 years ago

I've made a number of revisions to the figs and addressed your questions:

all figs: subtitle font is smaller. do you think it's still too big?
fig 1
- removed "RerFdn"
- plotted true class posteriors "f_Y|X" w/ different colormap and colorbar scale
fig 2
- panels C and F: RerF was running significantly slower than the others because of the way it was counting the total number of split variables for each tree. I removed this code completely as it isn't a necessary thing to do. I was doing it because in the past we had talked about using it as a measure of complexity.
- panels B and E: added Bayes error. the errors for Rotation RF might be wrong: they appear to be lower than the Bayes error in panel E. i'm still trying to resolve this.
fig 3
- added Trunk in panels F-J
- now each panel corresponds to a particular transformation. RerF does better than RF in each case but not better than RotRF in each case, particularly when the data is rotated. However, while RerF is still affected by rotation, panel G suggests that it isn't affected to the same extent as RF when d=10,50.
- x axes in panels A-E are now log scale and only go up to d=50 (before they went up to 100) to make the lines easier to distinguish.
fig 4
- this is for the real data
- the y-axes start higher up. in the caption or in the paper we should probably mention the y-intercepts, as it indicates the proportion of times each classifier was the best.
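
A minimal sketch of why the y-intercept of a performance profile is the proportion of times a classifier was the best. It assumes a ratio-based (Dolan-Moré style) profile, which this thread does not confirm; the function name and the error values are made up for illustration.

```python
def performance_profile(errors, taus):
    """Fraction of datasets on which each classifier is within a factor tau of the best.

    errors: dict mapping classifier name -> list of per-dataset error rates
            (same dataset order for every classifier). Hypothetical layout.
    taus:   increasing list of ratio thresholds, starting at 1.0.
    """
    names = list(errors)
    n = len(errors[names[0]])
    # Best (lowest) error achieved on each dataset by any classifier.
    best = [min(errors[a][i] for a in names) for i in range(n)]
    profile = {}
    for a in names:
        ratios = []
        for i in range(n):
            if best[i] > 0:
                ratios.append(errors[a][i] / best[i])
            else:
                # Best error is zero: ratio is 1 only if this classifier is also at zero.
                ratios.append(1.0 if errors[a][i] == 0 else float("inf"))
        profile[a] = [sum(r <= t for r in ratios) / n for t in taus]
    return profile


# Made-up error rates on four datasets, for illustration only.
errors = {
    "RF":    [0.10, 0.20, 0.30, 0.40],
    "RerF":  [0.08, 0.20, 0.33, 0.30],
    "RotRF": [0.09, 0.25, 0.28, 0.35],
}
taus = [1.0, 1.1, 1.2, 1.5]
profile = performance_profile(errors, taus)

# The value at tau = 1 is the y-intercept: the share of datasets where
# the classifier ties for the lowest error.
intercepts = {name: values[0] for name, values in profile.items()}
```

With ties (here RF and RerF tie on the second dataset) the intercepts can sum to more than 1.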

jovo commented 8 years ago

great!

fig 1
- i think we can remove the numbers on the xaxis and yaxis, right?
- i'd label "(D) Posterior" maybe?
- for the top colorbar, i'd only have about 4 numbers, like the bottom
- i'd replace 'x1' with 'x_1' and 'x2' with 'x_2'
- maybe make panel (D) the first panel?
fig 2
- for B & E, can we plot L_X - L_RF, for X \in {RerF, RotRF}, ie, the relative error compared with random forest? i think that might be more clear. let's still clear up why RotRF appears to be beating Bayes
- i think this fig might be better transposed, so there are 2 columns (one per setting), then you could not repeat the titles 3x, and the ylabels 2x.
- maybe trunk only needs to go up to 50?
- maybe the yaxis on B & E should be log scale? not sure, it just doesn't look very impressive.
fig 3
- i think nearly perfect! i'd remove RerFd here, because we'll have a fig 5 with various bells & whistles
- i can't tell whether log on xaxis is good here or not?
- i don't see panels F-J
fig 4
- why not make it 4 across, a la fig 3?
- i'd remove RerFd here too
- is there any way to make affine have more 'signal'?
- can we think of more semantic x & y labels?
- outlier might be that we resample the data but permute the indices; not really sure if that counts as an outlier?
fig 5
- something that shows RerFd, rank RerFd, and maybe some other things? not sure what exactly it should be.
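
For the fig 2 B & E suggestion above, a rough sketch of the L_X - L_RF computation. The dict layout, the function name, and all numbers are placeholders, not results from the simulations.

```python
def relative_error(errors, baseline="RF"):
    """Return L_X - L_baseline at each dimension d, for every X other than the baseline.

    errors: dict mapping classifier name -> list of error rates, one per value of d.
    Negative values mean X has lower error than the baseline.
    """
    base = errors[baseline]
    return {
        name: [e - b for e, b in zip(errs, base)]
        for name, errs in errors.items()
        if name != baseline
    }


# Placeholder error rates at d = 2, 10, 50 (not real results).
dims = [2, 10, 50]
errors = {
    "RF":    [0.20, 0.25, 0.30],
    "RerF":  [0.15, 0.22, 0.31],
    "RotRF": [0.18, 0.20, 0.28],
}
rel = relative_error(errors)
# rel["RerF"] is roughly [-0.05, -0.03, 0.01]
```

One thing to keep in mind: these differences can be negative, so a plain log-scale y-axis cannot display them directly.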

tyler-tomita commented 8 years ago

fig 1: truth in upper left
all relevant figs: standardize algorithm color scheme
subpanel labels (eg, (A)) can go in top left of each figure, because 'title' will now just be the title, and will be consistent across multiple panels.
put 'scenario' name on left side of each row.
fig 2
- more iterations
- RerF and RF up to 1000 for trunk
- make panels B and C log scale
- add RF in panels B & E
- put scenario name on left side, rather than top
fig 3
- remove legend
fig 4
- change y-axis label to "proportion"
- add second row of performance profiles for RF and RerF on (just remaining)
- make legend indicating AUC for each algorithm
fig 5
- Scatter plots of L_RF vs L_variant for all variants (RerF, RerFd, robust RerFd, RotRF) on untransformed benchmarks. (robust RerFd = rankRerFd?).
fig 6?
- training time performance profiles a la fig 4
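
One possible way to get the per-algorithm AUC for the fig 4 legend, assuming each performance profile is sampled on a grid of thresholds and integrated with the trapezoidal rule. Normalizing by the width of the grid is my assumption, so that a classifier that is always the best scores 1. The function name and inputs are hypothetical.

```python
def profile_auc(taus, fractions):
    """Normalized area under a performance profile via the trapezoidal rule.

    taus:      increasing threshold grid the profile was evaluated on.
    fractions: profile value (fraction of datasets) at each threshold.
    """
    area = 0.0
    for i in range(1, len(taus)):
        width = taus[i] - taus[i - 1]
        area += 0.5 * (fractions[i] + fractions[i - 1]) * width
    # Divide by the grid width so the maximum possible score is 1.
    return area / (taus[-1] - taus[0])


# Illustrative profile values only.
taus = [1.0, 1.1, 1.2, 1.5]
auc_example = profile_auc(taus, [0.75, 0.75, 1.0, 1.0])
auc_perfect = profile_auc(taus, [1.0, 1.0, 1.0, 1.0])
```

The value depends on the chosen threshold range, so the same grid should be used for every algorithm being compared.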

jovo commented 8 years ago

tyler-tomita / RandomerForest