henrygouk / dopt

A numerical optimisation and deep learning framework for D.
https://henrygouk.github.io/dopt/
BSD 3-Clause "New" or "Revised" License

implement CPU uniform #8

Closed WebFreak001 closed 6 years ago

WebFreak001 commented 6 years ago

Before implementing more kernels, I have a few questions about the current design:

Why the initialize function instead of shared static this?

Currently not all unittests pass for me (some CPU stuff is not implemented and my GPU is not from NVIDIA). Should we add a check to the CUDA unittests for whether CUDA is supported, and implement the CPU stuff before doing the other stuff?

The OpDef stuff looks useful for testing. Is it used for CPU kernels yet? I first implemented the uniform function thinking it was a ubyte[] instead of a float[], but no additional test failed. I'm also not sure whether I used the Buffer correctly, as I don't quite understand where it comes from yet.

henrygouk commented 6 years ago

> Why the initialize function instead of shared static this?

An earlier version did use shared static this, but I had to change it for a reason that I have since forgotten.

> Currently not all unittests pass for me (some CPU stuff is not implemented and my GPU is not from NVIDIA). Should we add a check to the CUDA unittests for whether CUDA is supported, and implement the CPU stuff before doing the other stuff?

This is one symptom of a slightly larger problem: currently this library is pretty much only usable if the user has CUDA installed. While a few small things can be accomplished using CPU-only code, quite a few operations are only implemented using CUDA. Also, all the systems that I use for testing have CUDA installed, so I'm not likely to pick up bugs such as these. I do intend to set up some sort of CI to alleviate this.
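
For reference, the kind of check asked about could be sketched roughly as follows. This is not dopt code: libraryAvailable and cudaAvailable are hypothetical helpers, and the sketch assumes a POSIX system where the CUDA driver is installed as libcuda.so.1. Being able to load the driver library does not guarantee a usable GPU, so treat it as a coarse guard only.

```d
import std.stdio : writeln;

version (Posix)
{
    import core.sys.posix.dlfcn : dlopen, dlclose, RTLD_LAZY;

    // Hypothetical helper: true if the dynamic loader can open the named library.
    bool libraryAvailable(const(char)* name)
    {
        void* handle = dlopen(name, RTLD_LAZY);
        if (handle is null)
            return false;
        dlclose(handle);
        return true;
    }
}
else
{
    // Conservative fallback for platforms this sketch does not cover.
    bool libraryAvailable(const(char)* name) { return false; }
}

// Assumption: the CUDA driver library is named libcuda.so.1 (typical on Linux).
bool cudaAvailable()
{
    return libraryAvailable("libcuda.so.1");
}

unittest
{
    // Skip instead of failing when no CUDA driver is present on this machine.
    if (!cudaAvailable())
        return;

    // ... CUDA-dependent assertions would go here ...
}

void main()
{
    writeln("CUDA available: ", cudaAvailable());
}
```

One drawback of returning early is that skipped tests are reported as passing, so a CI job on a machine with CUDA is still needed to exercise them.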

> The OpDef stuff looks useful for testing. Is it used for CPU kernels yet?

OpDef is used when constructing an Operation graph, rather than during execution of the graph. You can think of an Operation graph as kind of like an abstract syntax tree: it defines the computation rather than performing it.
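
The split between defining and performing a computation can be illustrated with a toy example. This is a minimal sketch only: Node, constant, add, mul and evaluate are hypothetical names invented for the illustration and are not dopt's Operation or OpDef API.

```d
import std.stdio : writeln;

// Hypothetical node type for illustration only; not dopt's Operation class.
final class Node
{
    string op;     // "const", "add" or "mul"
    Node[] deps;   // input nodes
    float value;   // only meaningful for "const" nodes

    this(string op, Node[] deps, float value = 0)
    {
        this.op = op;
        this.deps = deps;
        this.value = value;
    }
}

// Graph construction: these functions only build nodes; nothing is computed.
Node constant(float v) { return new Node("const", null, v); }
Node add(Node a, Node b) { return new Node("add", [a, b]); }
Node mul(Node a, Node b) { return new Node("mul", [a, b]); }

// Execution: a separate pass walks the graph and produces a result.
float evaluate(Node n)
{
    switch (n.op)
    {
        case "const": return n.value;
        case "add":   return evaluate(n.deps[0]) + evaluate(n.deps[1]);
        case "mul":   return evaluate(n.deps[0]) * evaluate(n.deps[1]);
        default:      assert(0, "unknown op: " ~ n.op);
    }
}

void main()
{
    // (2 + 3) * 4 described as a graph; no arithmetic has happened yet.
    auto graph = mul(add(constant(2), constant(3)), constant(4));
    writeln(graph.op);         // the root node is a "mul"
    writeln(evaluate(graph));  // the arithmetic only happens here
}
```

In this picture, something like OpDef would belong to the construction half (describing what an "add" node is and how to validate it), while CPU or CUDA kernels would belong to the evaluation half.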

A few of your queries are symptomatic of a lack of documentation for the inner workings of the library, which is something I plan to improve once the design is a bit more stable.