gvegayon / parallel

PARALLEL: Stata module for parallel computing
https://rawgit.com/gvegayon/parallel/master/ado/parallel.html
MIT License

Continuous-integration #18

Closed bquistorff closed 8 years ago

bquistorff commented 9 years ago

It would be nice to have continuous integration (CI) to automatically check changes for test errors. I am not a lawyer, but my reading of the single-user license terms is that if Stata is put on the CI server in a way that it can't be shared with other users, the server can count as one of the three machines allowed.

The most common CI services are Travis (for Linux) and Appveyor (for Windows). One could set up a personal server holding the Stata install files, accessible only with a secret username/password. Travis could store the login info encrypted, so that for tests Stata could be installed from the personal server. Appveyor has similar functionality.

Any thoughts?
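As a rough illustration of the install step described above, here is a minimal Python sketch of fetching the installer from a password-protected personal server. The environment variable names `STATA_USER` and `STATA_PASS` and the helper names are hypothetical, and HTTP basic authentication is an assumption; the thread does not say how the server would be protected. On a CI service the two variables would come from encrypted settings rather than from the repository.

```python
import base64
import os
import urllib.request


def build_installer_request(url, env=os.environ):
    """Build an authenticated request for the private Stata installer.

    STATA_USER and STATA_PASS are hypothetical names; on a CI service they
    would be supplied as encrypted settings and never committed to the repo.
    """
    user = env["STATA_USER"]
    password = env["STATA_PASS"]
    # HTTP basic auth is an assumption about how the personal server is secured.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def download_installer(url, destination, env=os.environ):
    """Download the installer archive to `destination` (needs network access)."""
    request = build_installer_request(url, env)
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())
```

Keeping the credentials in the environment means the script itself can live in the public repository without exposing the installer.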

gvegayon commented 9 years ago

Sounds like a really good idea. Actually, the idea behind test.do was exactly that. I don't have experience with CI (first time I've heard of it), but it does sound like something we should do here. Only one question: what's the big difference from setting up a cron task that checks compilation and test files every time the repo has changed?

On the licenses, right now I don't have a personal copy of Stata, but I might be able to get one in a month or two. Do you have an available copy for this?

bquistorff commented 9 years ago

The CI will test changes to head no matter who commits (whereas a cron you set up would have to figure out when I made additions, pull, and then re-run tests). The CI can then publish stats for the tests for all to see. See the 'build status' badge on pandas. CIs can also test pull requests (instead of having to check out a pull request locally to test it). See an example pull request.

I don't have an individual license. No rush, though, I just thought it would be a nice idea.

bquistorff commented 9 years ago

I wrote up how to do it (http://bquistorff.blogspot.com/2015/08/continuous-integration-with-stata.html) if you're curious. Since it needs encrypted information (in order to install Stata without letting everyone else use your copy), Travis will only check pushes by contributors, not pull requests from forks.

gvegayon commented 9 years ago

That's cool, I'll give it a look.

In other matters, yesterday I installed parallel on a Windows machine and it worked perfectly. Since, as far as I remember, no changes have been made to the OSX or Unix side of the code, I think it is ready to be sent to SSC. Only one small change: in the list of authors, I didn't know if you were OK with including your email and website. If you are, please modify the sthlp file.

In order to push it to SSC, it would be nice if you could provide an mlib compiled in Stata 12 or 13 (I think 13 is already OK).

Let me know your thoughts


bquistorff commented 9 years ago

OK, just pushed with some of my info. The committed mlib was compiled with version 12.1 (I've been doing that each time now). Does that seem good?

gvegayon commented 9 years ago

That'll do the job. I have made some small changes to the version number and the sthlp file. I think it may be good to rewrite the description as well so it is more user-friendly/sexy (if you know what I mean). What do you think? BTW, I forgot to tell you that what I use for versioning is 0.yy.mm.dd if the version is beta, 1.yy.mm.dd otherwise.
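The versioning scheme above can be sketched in a few lines of Python. The function name `version_string` is hypothetical, and zero-padding the month and day is an assumption, since the comment only gives the pattern 0.yy.mm.dd / 1.yy.mm.dd.

```python
from datetime import date


def version_string(release_date, beta):
    """Format a version as 0.yy.mm.dd for beta releases, 1.yy.mm.dd otherwise."""
    major = 0 if beta else 1
    # Zero-padded month and day are an assumption about the exact layout.
    return f"{major}.{release_date:%y.%m.%d}"


print(version_string(date(2015, 8, 19), beta=True))   # 0.15.08.19
print(version_string(date(2015, 8, 19), beta=False))  # 1.15.08.19
```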


bquistorff commented 9 years ago

Sure, a "spiced up" description sounds great.

gvegayon commented 9 years ago

Hey Brian,

Here is an idea, what do you think? (please change/add/remove whatever you want)

Parallel lets you run Stata faster, sometimes faster than MP itself. By organizing your job across several Stata instances, parallel gives you out-of-the-box parallel computing. Using the 'parallel' prefix, you can get faster simulations, bootstrapping, reshaping of big data, etc. without having to know a thing about parallel computing. With no need to have Stata/MP on your system, parallel has been shown to dramatically speed up computations by two, four, or more times, depending on how many cores your processor has.

George


bquistorff commented 9 years ago

Sounds good.

gvegayon commented 9 years ago

Great, tomorrow I'll try parallel using a Mac. After that I'll send it to SSC.

bquistorff commented 8 years ago

I have a question about how you want to do version numbers. Do you only update a code file's version to the current one if it was changed in that release? I'm assuming that one should always update parallel.ado even if it wasn't touched, because people often look for the version by running which parallel.

NB: I wanted to clarify this before I bump the version on the code changes I did today (I did change the dist-date anyway).
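To make the question concrete: Stata's `which` command prints the lines of an ado-file that begin with `*!`, which is why the header of parallel.ado is where people look for the version. Below is a minimal Python sketch of a check that lists files whose header version differs from the release version. The helper names are hypothetical, and the header layout `*! version <number> <date>` is an assumption, since the actual header of parallel.ado is not shown in this thread.

```python
import re

# Assumes a header such as "*! version 1.16.04.20 20apr2016"; the exact layout
# used in parallel.ado is not shown in this thread.
VERSION_LINE = re.compile(r"^\*!\s*version\s+(\S+)", re.IGNORECASE)


def read_version(ado_text):
    """Return the version token from the first '*! version' line, or None."""
    for line in ado_text.splitlines():
        match = VERSION_LINE.match(line.strip())
        if match:
            return match.group(1)
    return None


def stale_files(sources, release_version):
    """List file names whose header version differs from the release version."""
    return sorted(name for name, text in sources.items()
                  if read_version(text) != release_version)
```

Under the "only bump changed files" rule, such a list would be expected to be non-empty; the check is mainly useful for confirming that parallel.ado itself always carries the current version.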

gvegayon commented 8 years ago


  1. Yes
  2. Sounds good. Let's start changing that whenever a change is made.


gvegayon commented 8 years ago

Hey, about this https://github.com/bquistorff/Stata-modules/blob/master/.travis.yml and this http://bquistorff.blogspot.com/2015/08/continuous-integration-with-stata.html: can we use that here too? It would be awesome, and I understand that your commits would be the only ones that wouldn't fail with Travis (since yours would be the ones that actually run Stata).