src-d / ml-backlog

Issues belonging to source{d}'s Machine Learning team which cannot be related to a specific repository.

[style-analyzer] Run reports with various HP constraints #1

Closed: vmarkovtsev closed this issue 5 years ago

vmarkovtsev commented 5 years ago

It would be nice to measure ~10 data points so that we can plot a graph in the paper.

@zurk advises, @m09 runs the experiments. science-3 can handle around 15 parallel instances.
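A minimal sketch of how such a sweep could be scheduled, assuming each report is an independent job and that at most 15 should run at once. `run_report`, the `hp_value` key, and the grid values are hypothetical placeholders, not part of style-analyzer.

```python
from concurrent.futures import ThreadPoolExecutor

# "Around 15 parallel instances" is the capacity quoted for science-3.
MAX_PARALLEL = 15


def run_report(hp_value):
    """Hypothetical placeholder: launch one report for one HP setting.

    A real job would start the analyzer run and collect its metrics;
    here we only echo the setting so the sketch stays runnable.
    """
    return {"hp_value": hp_value, "quality": None}


def sweep(values, max_parallel=MAX_PARALLEL):
    """Run one report per value, never exceeding max_parallel at a time."""
    workers = max(1, min(max_parallel, len(values)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(run_report, values))


# Illustrative grid of ~10 settings, enough points for one curve.
grid = [1, 2, 5, 10, 15, 20, 30, 40, 45, 50]
results = sweep(grid)
```

With about 10 settings and room for 15 instances, the whole grid fits in a single wave.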

m09 commented 5 years ago

Results are out for the first batch: https://gist.github.com/m09/fb0292645fa12002775cea22bbee838f

They clearly indicate that we should regularize aggressively with min_samples_leaf. I'll launch another batch with values above 50 so that we have a better picture of the tradeoff.
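To illustrate why a larger min_samples_leaf acts as regularization, here is a toy sketch in pure Python. It is not the project's model or any library's implementation: `count_leaves` and `misclassified` are hypothetical helpers that grow a greedy tree over a single, already sorted feature and only allow splits that leave at least `min_samples_leaf` samples on each side.

```python
def misclassified(labels):
    """Number of samples that disagree with the group's majority label."""
    return len(labels) - max(labels.count(c) for c in set(labels))


def count_leaves(labels, min_samples_leaf):
    """Count the leaves of a greedy tree over labels sorted by one feature."""
    n = len(labels)
    # Stop when the node is pure or too small to form two valid leaves.
    if len(set(labels)) <= 1 or n < 2 * min_samples_leaf:
        return 1
    # Only cuts that keep min_samples_leaf samples on both sides are allowed.
    cuts = range(min_samples_leaf, n - min_samples_leaf + 1)
    best = min(
        cuts,
        key=lambda c: misclassified(labels[:c]) + misclassified(labels[c:]),
    )
    return (count_leaves(labels[:best], min_samples_leaf)
            + count_leaves(labels[best:], min_samples_leaf))


# Worst case for overfitting: labels alternate, so no real pattern exists.
labels = [0, 1] * 20
leaves_by_setting = {m: count_leaves(labels, m) for m in (1, 5, 20)}
```

With the constraint at 1 the tree memorizes every sample. Raising it caps the number of leaves at `len(labels) // min_samples_leaf`, which is the tradeoff the batch is meant to map out.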

zurk commented 5 years ago

Looks like this one is gone? Should we post the experiment results here?

vmarkovtsev commented 5 years ago

@m09

m09 commented 5 years ago

Results are available in the paper in the min-samples-leaf.pdf figure.

m09 commented 5 years ago

It seems that either the GH upload or the Overleaf download has an implicit conversion step, so the file is a PNG rather than a PDF (and I can't re-upload it because it's the same file).

zurk commented 5 years ago

Fixed: