Strider-CD / strider-simple-runner

Easy-to-configure in-process Runner implementation for Strider.
MIT License

Configurable job parallelism #29

Closed smashwilson closed 8 years ago

smashwilson commented 8 years ago

I'm introducing a runner configuration option, concurrentJobs, to allow multiple jobs to be scheduled in parallel.

knownasilya commented 8 years ago

Good start

niallo commented 8 years ago

Good to see some work on this. The data dirs could be keyed by job ID or similar. Reaping them becomes a little tricky. Some folks may like to re-use build directories in some circumstances too (even though that's not always going to be safe). Just something to think about.

knownasilya commented 8 years ago

I feel like concurrent jobs for the same project in the same branch don't make sense at the moment. Maybe that should be the limitation.

smashwilson commented 8 years ago

> I feel like concurrent jobs for the same project in the same branch don't make sense at the moment. Maybe that should be the limitation.

Hmm! That's going to be tricky to manage - an async.queue won't cut it any more, because more commits can arrive on a branch corresponding to an existing job even while the queue is unsaturated.

Concurrent builds on a single branch will be an issue if they trigger deployments, though.
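
To make the scheduling problem concrete, here is a minimal sketch of a queue that runs up to a fixed number of jobs at once but never two jobs for the same project and branch. It is not the JobQueue from this PR: the class name `BranchAwareQueue` and the `project` and `branch` job fields are assumptions made for illustration.

```javascript
// Hypothetical sketch: a queue with a global concurrency limit that also
// refuses to run two jobs for the same project and branch at the same time.
class BranchAwareQueue {
  constructor(concurrency, handler) {
    this.concurrency = concurrency; // maximum number of jobs running at once
    this.handler = handler;         // function (job, done) that runs one job
    this.pending = [];              // waiting jobs, in arrival order
    this.active = new Map();        // branch key -> job currently running
  }

  // Identifies "same project, same branch". The field names are assumptions.
  static keyFor(job) {
    return job.project + '/' + job.branch;
  }

  push(job) {
    this.pending.push(job);
    this.drain();
  }

  // Start every pending job that fits under the limit and whose branch is idle.
  drain() {
    let i = 0;
    while (this.active.size < this.concurrency && i < this.pending.length) {
      const job = this.pending[i];
      const key = BranchAwareQueue.keyFor(job);
      if (this.active.has(key)) {
        i++; // branch is busy: leave the job in place and look further down
        continue;
      }
      this.pending.splice(i, 1);
      this.active.set(key, job);
      let finished = false;
      this.handler(job, () => {
        if (finished) return; // ignore a second call to done()
        finished = true;
        this.active.delete(key);
        this.drain();
      });
    }
  }
}
```

A later job for a busy branch keeps its place in line while jobs for other branches are allowed to start ahead of it, which is the behaviour a plain fixed-concurrency queue cannot express.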

> The data dirs could be keyed by job ID or similar. Reaping them becomes a little tricky.

I had been thinking about maintaining an in-memory structure of data directories that are in-use and excluding them from the culling somehow. I think the runner can get away with something in memory because simple-runner jobs don't persist across service restarts anyway - otherwise I'd need to either parse the job ID from the data directory name or use some kind of flag file to identify active data directories.
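
A rough sketch of that in-memory idea, kept free of file system calls so it is easy to follow. The function names and the `{ path, mtime }` entry shape are hypothetical, and the choice that in-use directories do not count toward the number of recent directories kept is an assumption.

```javascript
// Hypothetical sketch: data directories currently used by a running job.
// This lives only in memory, so it is empty again after a service restart.
const activeDirs = new Set();

function markActive(dir) {
  activeDirs.add(dir);
}

function markDone(dir) {
  activeDirs.delete(dir);
}

// dirs: array of { path, mtime } entries describing existing data directories.
// Returns the paths that may be deleted: everything that is not in use,
// except the `keepRecent` most recently modified idle directories.
function selectDirsToCull(dirs, keepRecent) {
  const idle = dirs
    .filter((d) => !activeDirs.has(d.path))
    .sort((a, b) => b.mtime - a.mtime); // newest first
  return idle.slice(keepRecent).map((d) => d.path);
}
```

A job would call `markActive` before cloning into its directory and `markDone` when it finishes, so a cull pass that runs in between never selects a directory that is still in use.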

> Some folks may like to re-use build directories in some circumstances too (even though that's not always going to be safe). Just something to think about.

Re-using a data directory from one job to another? I'd thought that was what the cache was intended for. With the job ID in the data directory name, how does that work - or am I misunderstanding?

knownasilya commented 8 years ago

As far as I know, there are two concepts: the cache and the data dir. The cache is used by certain plugins, while the data dir is where the project is cloned and built. Personally, I always copy the contents of the data dir to /var/www/<project name> when using Strider locally.

smashwilson commented 8 years ago

Okay, I've revisited this to:

Here's what the data directory structure looks like:

~/code/strider-simple-runner (parallel-jobs=) 
$ find /Users/ashl6947/.strider/data -type d -maxdepth 2
/Users/ashl6947/.strider/data
/Users/ashl6947/.strider/data/deconst-deconst-control-master
/Users/ashl6947/.strider/data/deconst-deconst-control-master/job-56f29d31a141bc8723e5bf9a
/Users/ashl6947/.strider/data/deconst-deconst-docs-control-master
/Users/ashl6947/.strider/data/deconst-deconst-docs-control-master/job-56f29d30a141bc8723e5bf98
/Users/ashl6947/.strider/data/deconst-deconst-docs-master
/Users/ashl6947/.strider/data/deconst-deconst-docs-master/job-56f29d2fa141bc8723e5bf97
/Users/ashl6947/.strider/data/deconst-deploy-master
/Users/ashl6947/.strider/data/deconst-deploy-master/job-56f29d30a141bc8723e5bf99

I've also tried to be careful in extracting branch names from the job in a sane way because Job's ref field can hold arbitrary things. I'm testing for a few common keys (branch, fetch) but falling back to an ugly-but-safe "URI-encoded stable JSON representation of the whole thing".
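
The fallback order described above could look roughly like this. It is a sketch, not the code from the PR: `branchKey` and `stableStringify` are hypothetical names, and `ref` is assumed to hold plain JSON data.

```javascript
// Deterministic JSON: object keys are emitted in sorted order at every level,
// so two refs with the same content always produce the same string.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(stableStringify).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value)
      .filter((k) => value[k] !== undefined) // JSON has no "undefined"
      .sort();
    const body = keys.map((k) => JSON.stringify(k) + ':' + stableStringify(value[k]));
    return '{' + body.join(',') + '}';
  }
  const json = JSON.stringify(value);
  return json === undefined ? 'null' : json;
}

// Try the common keys first, then fall back to the ugly-but-safe encoding.
function branchKey(ref) {
  if (ref && typeof ref.branch === 'string') return ref.branch;
  if (ref && typeof ref.fetch === 'string') return ref.fetch;
  return encodeURIComponent(stableStringify(ref));
}
```

The sketch returns `branch` and `fetch` values unchanged; making those safe for use in a directory name (for example, a branch name containing a slash) is left out here.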

knownasilya commented 8 years ago

Do you think this is ready for merge? Did you have any plans to write some basic tests for this JobQueue?

knownasilya commented 8 years ago

Nvm, you have tests.

smashwilson commented 8 years ago

I was just giving it a last look-over to see if there was anything obvious I'd missed. I don't think so :smile:

knownasilya commented 8 years ago

So it's probably a good idea to expose the configuration in the Strider UI.

smashwilson commented 8 years ago

Yeah, agreed. The number of recent directories to keep should be simple enough to add. The level of concurrency is less clear to me.

Is there a place for installation-wide (rather than build-specific) configuration for runner plugins?

knownasilya commented 8 years ago

Hmm, no, but https://github.com/Strider-CD/strider/pull/681 would help when I get more time to finish it.

knownasilya commented 8 years ago

How is it configured now, an env var? We need to document it in the meantime.

smashwilson commented 8 years ago

Yeah, CONCURRENT_JOBS.
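
For anyone documenting this, reading the variable could look something like the sketch below. Only the name CONCURRENT_JOBS comes from this thread; the helper name and the fallback to 1 for a missing or invalid value are assumptions, not necessarily what the runner does.

```javascript
// Hypothetical helper: turn CONCURRENT_JOBS into a usable concurrency level.
// Falling back to 1 (one job at a time) is an assumption made for this sketch.
function concurrencyFromEnv(env) {
  const parsed = parseInt(env.CONCURRENT_JOBS, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 1;
}

const concurrentJobs = concurrencyFromEnv(process.env);
```

Starting the service with, for example, `CONCURRENT_JOBS=4` in its environment would then yield a concurrency of 4 under these assumptions.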

smashwilson commented 8 years ago

It's a little generic, but I wasn't sure how far to go down the STRIDER_SIMPLE_RUNNER_... prefixing road. I can patch it if there's a pattern I wasn't following.

knownasilya commented 8 years ago

No, this is good. I documented it in the strider repo and updated the version of simple-runner.