Non-random IFS kernels; Hutchinson Redesign

I think the best approach for certain objects will actually be a counting-based solution, as described here: https://www.youtube.com/watch?v=OjTwzFzxR8M

So in the end, if we want to calculate a 1000 x 1000 square, we would need to calculate ~1,000,000 children. Assuming an IFS with 4 functions (hence base 4), that is ~10 generations, since 4^10 = 1,048,576. If we run with 256 threads, each thread then handles 4^10 / 256 = 4,096 of those children. This means we need to count through 0_000_000_000 (10 digits) in base-4 space. Each digit selects one function of the IFS, so each number in the counting space represents one sequence of function applications.
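As a rough check of the arithmetic above, here is a minimal sketch in plain Julia. The helper names `generations_needed` and `function_choices` are made up for illustration and are not part of Fable.jl; the 4-function IFS and the 256 threads are the assumptions from the paragraph above.

```
# Hypothetical helpers for illustration only (not Fable.jl API).

# Smallest number of generations g such that n_fxs^g covers n_points.
function generations_needed(n_points::Integer, n_fxs::Integer)
    g = 0
    leaves = 1
    while leaves < n_points
        leaves *= n_fxs
        g += 1
    end
    return g
end

# Decode a counter into one function choice per generation.
# Digits are 0-based and ordered least significant first.
function_choices(counter::Integer, n_fxs::Integer, n_gens::Integer) =
    digits(counter; base = n_fxs, pad = n_gens)

n_gens     = generations_needed(1000 * 1000, 4)  # generations for ~1,000,000 children
n_leaves   = 4^n_gens                            # size of the counting space
per_thread = cld(n_leaves, 256)                  # counters handled by each of 256 threads
choices    = function_choices(27, 4, n_gens)     # base-4 digits of the counter 27
```

Here `n_gens` comes out to 10 and `per_thread` to 4096, so the thread count changes how much each thread counts, not how many generations are needed.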

The idea would be to (using notation introduced in #24):

1. Determine the number of generations we need to generate.
   - This can easily be super-sampled by taking 1 more generation, which would mean smear frames on the new FractalLayer will still appear clean and not grainy.
2. Figure out where in counting space each thread is.
3. Give each thread that offset and a number to count to, then have it calculate the point positions assigned to it.
   - Note that this means some threads may have to do a lot more computation than other threads.
   - Maybe offset this by somehow shuffling the counting array? That could cause undue warp divergence...
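The steps above can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the four half-scale maps toward the corners of the unit square are a stand-in IFS (its attractor is the filled square), the names `apply_fx`, `count_points!`, and `run_all` are hypothetical, and the loop over `tid` stands in for the GPU thread index. None of this is Fable.jl's actual API.

```
# Assumed example IFS: four maps that halve the distance to a corner of the unit square.
const CORNERS = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0))
apply_fx(p, fid) = ((p[1] + CORNERS[fid][1]) / 2, (p[2] + CORNERS[fid][2]) / 2)

# Work for one (hypothetical) thread: count `n` steps starting at `offset`,
# reading one base-n_fxs digit of the counter per generation.
function count_points!(points, offset, n, n_gens;
                       n_fxs = length(CORNERS), start = (0.5, 0.5))
    for c in offset:(offset + n - 1)
        p = start
        rem_c = c
        for _ in 1:n_gens
            fid = rem_c % n_fxs + 1   # current digit, shifted to a 1-based function id
            p = apply_fx(p, fid)
            rem_c ÷= n_fxs            # move on to the next digit
        end
        points[c + 1] = p             # counters are 0-based, arrays are 1-based
    end
    return points
end

# Split the counting space into contiguous chunks; the last chunk may be shorter.
function run_all(n_gens, n_threads; n_fxs = length(CORNERS))
    total  = n_fxs^n_gens
    chunk  = cld(total, n_threads)
    points = Vector{NTuple{2, Float64}}(undef, total)
    for tid in 0:(n_threads - 1)      # stand-in for the GPU thread index
        offset = tid * chunk
        offset >= total && break
        count_points!(points, offset, min(chunk, total - offset), n_gens; n_fxs = n_fxs)
    end
    return points
end

points = run_all(3, 8)   # 4^3 = 64 points, 8 counters per thread
```

Because every counter is decoded independently, the result does not depend on how the counting space is split across threads. That makes it straightforward to compare different chunking or shuffling schemes against a single-threaded run.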

Here's the plan so far (#29):

[x] Implement a redesign for the final / post Hutchinson operators via counting through fid. This sets the framework for counting
[ ] Implement a new struct for color output so we can output at any step in the process, i.e.:
```
struct ColorOutput{C, B}
    color::C
    output::B
end
```
[ ] Implement the non-random IFS kernel and test with the rectangle (for super-sampling, etc.)
- [ ] Super-sampling
- [x] keep points as well in the FractalLayer and use them for the next iteration? #33
  - [ ] Change number of iterations for each frame in a video if points are kept?
[ ] Compare to High Performance IFS, as noted in #2

leios / Fable.jl

Non-random IFS kernels; Hutchinson Redesign #26