csgillespie / benchmarkme

Crowd sourced benchmarking
https://csgillespie.github.io/benchmarkme/

How to retrieve and compare two different results #18

Closed AndreaPi closed 5 years ago

AndreaPi commented 5 years ago

Hi,

Compliments on the package! It's great 😄 I was looking for a set of benchmarks to compare different configurations (libraries, multithreading, etc.) of my R setup when I found out about your package, which does all I need and much more. I have a question: I uploaded two sets of results with upload_results(res, args = list(sys_info=FALSE)), with IDs "2018-10-18-46512099" and "2018-10-18-14122665". Is there a way to retrieve and compare them? I tried to use the benchmarkmeData package, but I couldn't find any results from 2018.

AndreaPi commented 5 years ago

No news on this? What a pity 🙁

csgillespie commented 5 years ago

Sorry. Slipped off my radar. I'll look at it in the next few days.

On Wed, 14 Nov 2018, 19:02 Andrea Panizza <notifications@github.com> wrote:

> No news on this? What a pity 🙁


csgillespie commented 5 years ago

@AndreaPi Just finishing off a new version. Another day or two.

csgillespie commented 5 years ago

All past benchmarks have been uploaded in 7c77f362619c39a527818c9135fee1e921ba63eb

To access, just use

data(past_results, package="benchmarkmeData")

I'm just about to release a new version. Unfortunately, due to R changing the byte-compile option, it won't be possible to use old results. The results will still be in the package for historical purposes.

csgillespie commented 5 years ago

Just noticed that Travis didn't like the commit. You can download the results directly from

https://github.com/csgillespie/benchmarkme-data/blob/ed749ca9e4497e28423c697b2acb7e81747efab3/data/past_results.RData

and just load them via load("past_results.RData")
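For anyone following along, the download-and-load step can be scripted in a few lines of base R. One caveat: the GitHub blob URL above serves an HTML page, so appending `?raw=true` (or switching to the raw.githubusercontent.com host) is needed to get the actual binary file. A sketch:

```r
# Fetch the historical results file and load it into the current session.
# The "?raw=true" suffix makes GitHub serve the binary .RData file rather
# than the HTML page that the plain blob URL returns.
url <- paste0("https://github.com/csgillespie/benchmarkme-data/blob/",
              "ed749ca9e4497e28423c697b2acb7e81747efab3/data/past_results.RData",
              "?raw=true")
download.file(url, destfile = "past_results.RData", mode = "wb")
load("past_results.RData")  # creates the object(s) stored in the file
ls()                        # check what was loaded, e.g. `past_results`
```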

AndreaPi commented 5 years ago

Thanks for the updates! I'm a bit confused, though: commit https://github.com/csgillespie/benchmarkme/commit/7c77f362619c39a527818c9135fee1e921ba63eb doesn't show any file changes, so I'm not sure whether anything was actually modified. If I understand correctly, I can load the old data into my current session like this:

You can download the results directly from

https://github.com/csgillespie/benchmarkme-data/blob/ed749ca9e4497e28423c697b2acb7e81747efab3/data/past_results.RData

and just load them via load("past_results.RData")

However, I can't compare an old result with a new one because of the changes introduced in R 3.4.0 (the JIT compiler). Correct?

Also, a slightly different but related question: suppose I run two new tests now and upload them. Is it possible to compare these two new results, or will that be possible once the new version is released? Basically, I'm asking you to introduce the possibility of filtering the online result database by ID.
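Once the past results are loaded, this kind of comparison would be ordinary data-frame subsetting. A hypothetical sketch (the `id` column name is an assumption; check `names(past_results)` first):

```r
# Hypothetical sketch: subset a loaded results table by upload ID.
# Assumes `past_results` is a data frame with an `id` column; verify the
# actual column names with names(past_results) before relying on this.
my_ids <- c("2018-10-18-46512099", "2018-10-18-14122665")
mine   <- past_results[past_results$id %in% my_ids, ]
mine
```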

csgillespie commented 5 years ago

Thanks for the updates! I'm a bit confused, though: commit 7c77f36 doesn't show any file changes,

I messed that bit up (sorry). That's why I gave a link to the results (and hoped you didn't spot the error).

I can't compare an old result with a new one b/c of the changes introduced in R 3.4.0 (the JIT compiler).

The reason for the mess is a bit more complicated. I have always tried to detect the JIT, but R 3.X introduced the byte compiler by default. Then a bug was noticed in R 3.Y, so it did use the full JIT, but 3.Z fixed this.

Just compare. You should be able to spot if the byte compiler was an issue. It would only really affect the "prog" benchmarks.

Basically, I'm asking to introduce the possibility to filter the online result database by ID.

I've almost finished the new version, so please test and give feedback. In theory the data will be there, so sorting by ID can always be added later, though probably not in the first version.

The thing I've struggled with is privacy. I would like to make it easier for people to retrieve their results, but still maintain their privacy. Perhaps I could use an Renviron flag?
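A minimal sketch of how such a flag might work (the variable name `BENCHMARKME_ID` is invented here for illustration and is not part of the package):

```r
# Hypothetical: the user stores their upload ID in ~/.Renviron, e.g.
#   BENCHMARKME_ID=2018-10-18-46512099
# and a retrieval function reads it at run time:
my_id <- Sys.getenv("BENCHMARKME_ID", unset = NA)
if (!is.na(my_id)) {
  message("Would fetch full results for ID: ", my_id)
} else {
  message("No BENCHMARKME_ID set; only public (anonymised) results available")
}
```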

AndreaPi commented 5 years ago

The thing I've struggled with is privacy. I would like to make it easier for people to retrieve their results, but still maintain their privacy. Perhaps I could use an Renviron flag?

Ah, good point. I'm not sure how you would use an Renviron flag to guarantee privacy. Personally, I don't care about privacy in this context because, as far as I understand, the only really private info is the user name and nodename reported by Sys.info(). However, according to the README, these data are removed when uploaded to the public database:

https://github.com/csgillespie/benchmarkme

What's more, as also mentioned in the README, anyone who is really fussy about this information can just set

upload_results(res, args = list(sys_info=FALSE))

to prevent the info from being uploaded in the first place.

In short, I don't see privacy as an issue at all; however, that's just me, and I can't speak for other users. I think the only way to guarantee absolute privacy would be to use key pairs: anyone can upload using a public key, but only the owner can then download her/his uploaded data in full, unabridged form, using the corresponding private key. Other users could only download partial information, such as the performance test results and maybe the R version/platform info, but not the rest. However, that seems a needless complication to me.

AndreaPi commented 5 years ago

PS: I'll definitely test and give feedback :-)

csgillespie commented 5 years ago

@AndreaPi Just uploaded v2. Any comments, just email csgillespie@gmail.com

Thanks

To install

devtools::install_github("csgillespie/benchmarkme-data")
devtools::install_github("csgillespie/benchmarkme")
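A typical session after installing, using only the functions already shown in this thread and documented in the package README (running the benchmarks takes a few minutes, and uploading needs a network connection):

```r
# Run the standard benchmark suite, inspect the results, and upload them.
library("benchmarkme")
res <- benchmark_std(runs = 3)  # standard programming/matrix benchmarks
plot(res)                       # compare your machine against past uploads
upload_results(res, args = list(sys_info = FALSE))  # omit Sys.info() details
```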