Finishing up pipelining

As the title says, pipeline parallelism needs to be finished up. A checklist of what remains to be done:

Read/write code for internal queues needs to sync when the respective queue finishes.
Implement multiprocessor support and thread safety for the scheduler.
Support repeat et al. in TaskGen.
- This will need a scheduler call to reset the initialized flag of a task.
Insert commit queues at start of Bind stretch.
- Implement cardinality counting for queue commits in TaskGen.
- Commit the commit queues once in a while from the driving computer.
Generate sync code from SyncInfo for computers.
- See the Haddock for SyncInfo for details on how this should be done.
- Important to remember that we can't spin-wait, so every wait needs to take the form of launching a new task containing if(what_were_waiting_for) proceed();.
Generate a task_struct at top level for each task, and have scheduler calls refer to them.
Fix rewritten types. They seem to work, but may not be strictly correct everywhere.
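
To make the "no spin waiting" item concrete, here is a rough sketch of what waiting-as-a-task could look like. Everything in it is hypothetical (the task record, launch, step, run_demo); it is not the actual task_struct or scheduler API from csrc/sched.{h,c}, just an illustration of the if(what_were_waiting_for) proceed(); shape, with the task relaunching itself when the condition does not hold yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task record; NOT the real task_struct from csrc/sched.h. */
typedef struct task {
    bool (*ready)(void *ctx);   /* the thing we are waiting for */
    void (*proceed)(void *ctx); /* what to do once it has happened */
    void *ctx;
} task;

/* Hypothetical single-threaded run queue (fixed-size ring buffer). */
#define QUEUE_CAP 16
static task run_queue[QUEUE_CAP];
static size_t q_head = 0, q_len = 0;

/* Enqueue a task; returns false if the queue is full. */
static bool launch(task t) {
    if (q_len == QUEUE_CAP) return false;
    run_queue[(q_head + q_len) % QUEUE_CAP] = t;
    q_len++;
    return true;
}

/* Run one task from the queue; returns false if the queue is empty.
 * A task whose condition does not hold yet is relaunched instead of
 * blocking, so other tasks get to run in between. */
static bool step(void) {
    if (q_len == 0) return false;
    task t = run_queue[q_head];
    q_head = (q_head + 1) % QUEUE_CAP;
    q_len--;
    if (t.ready(t.ctx)) {
        t.proceed(t.ctx);
    } else {
        bool ok = launch(t); /* cannot fail: we just freed a slot */
        assert(ok);
    }
    return true;
}

/* Demo state: a consumer waits for data that a producer supplies. */
typedef struct { bool data_available; int consumed; } demo_state;

static bool always_ready(void *ctx) { (void)ctx; return true; }
static bool data_ready(void *ctx) { return ((demo_state *)ctx)->data_available; }
static void produce(void *ctx) { ((demo_state *)ctx)->data_available = true; }
static void consume(void *ctx) { ((demo_state *)ctx)->consumed++; }

/* Launch the consumer before the producer and count scheduler steps. */
int run_demo(void) {
    demo_state st = { false, 0 };
    task consumer = { data_ready, consume, &st };
    task producer = { always_ready, produce, &st };
    int steps = 0;

    q_head = 0;
    q_len = 0;
    launch(consumer);
    launch(producer);
    while (step()) steps++;

    assert(st.consumed == 1);
    return steps;
}
```

In run_demo the consumer is launched first, finds no data and is put back on the queue, the producer then runs, and the consumer proceeds on its second attempt. A real version would also need the thread safety mentioned above, which this sketch ignores.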

Most of the interesting stuff for all this happens in src/TaskGen, src/Codegen/CgOpt.hs and csrc/{sched,commit_queues}.{h,c}.
