mmaul / clml

Common Lisp Machine Learning Library

Rationale for OPTIMIZE declarations? #51

Open lukego opened 2 years ago

lukego commented 2 years ago

I'm taking a look at CLML for the first time. It looks really extensive!

The first routines I've looked at have scared me a bit, though, by using (DECLARE (OPTIMIZE (SPEED 3) (SAFETY 0) (DEBUG 0))). This puts a lot of responsibility on the caller, at least with SBCL, because passing arguments of the wrong type leads to undefined behavior such as heap corruption, floating-point exceptions, or segmentation faults.

Is there some specific strategy for which functions have such declarations and how they are supposed to be called safely?

Specifically I first noticed this with CLML.STATISTICS.RAND:NORMAL-RANDOM which is an attractive-looking public API function that misbehaves when called with non-double-float arguments:

;; Good
BO> (clml.statistics.rand:normal-random 0d0 1d0)
-1.1053788844727181d0

;; Bad: omitting mandatory args, returns non-random numbers
BO> (clml.statistics.rand:normal-random)
8.21147389424742d62
BO> (clml.statistics.rand:normal-random)
8.21147389424742d62
BO> (clml.statistics.rand:normal-random)
8.21147389424742d62

;; Bad: integer args, segfault
BO> (clml.statistics.rand:normal-random 0 1)
; Evaluation aborted on #<SB-SYS:MEMORY-FAULT-ERROR {100D9CB5D3}>.

;; Bad: single-float args, segfault
BO> (clml.statistics.rand:normal-random 0.0 1.0)
; Evaluation aborted on #<SB-SYS:MEMORY-FAULT-ERROR {100DB6BB03}>.

This feels too much like living dangerously to me. I'm tempted to compile CLML using SB-EXT:RESTRICT-COMPILER-POLICY to prevent SAFETY/DEBUG from going down to zero. Then, if I need to optimize this code aggressively, I could allow more aggressive optimization in the caller and declare the call as inline. That way the application would be choosing how to make the speed/safety trade-off rather than inheriting very aggressive defaults.
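A minimal sketch of that idea, assuming SBCL. SB-EXT:RESTRICT-COMPILER-POLICY is a real SBCL extension, but UNCHECKED-ADD is a hypothetical stand-in for a library routine compiled with (SAFETY 0), not CLML code. The restriction is global, so it would need to be in effect before the library is compiled, and it stays in effect for later compilations too.

```lisp
;; SBCL-only: put a floor of 1 under SAFETY and DEBUG for everything
;; compiled from here on, whatever local OPTIMIZE declarations request.
#+sbcl
(eval-when (:compile-toplevel :load-toplevel :execute)
  (sb-ext:restrict-compiler-policy 'safety 1)
  (sb-ext:restrict-compiler-policy 'debug 1))

;; Hypothetical stand-in for a library function that asks for SAFETY 0.
;; Under the restriction above, SBCL treats the request as SAFETY 1.
(defun unchecked-add (mean sd)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (type double-float mean sd))
  (+ mean sd))

;; A well-typed call behaves as before.
(unchecked-add 0d0 1d0)

;; With the floor in place, an ill-typed call such as (unchecked-add 0 1)
;; should be caught by an argument type check rather than running
;; unchecked machine code on the wrong representation.
```

Whether the remaining overhead is acceptable would have to be measured per application; the point of the sketch is only that the policy floor is set by whoever compiles the code rather than by the library.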

Does that make sense? I am mostly trying to understand how an application programmer is supposed to think about calling this code when it has such aggressive speed-over-safety optimizations baked in.

mmaul commented 2 years ago

So the optimize strategy... that was mostly inherited from the original MSI codebase. Since it was originally internal code, I expect a lot of responsibility was placed on the caller, because the callers were quite familiar with the code. Now that it is out in the wild, it would probably be wise to optimize for safety and stability. I will consider this...
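One way such a change could look, sketched in portable Common Lisp: keep the aggressive policy on an internal kernel and validate arguments in the exported entry point. Both function names here are hypothetical and are not part of CLML; the arithmetic is only a placeholder for real numerical work.

```lisp
;; Hypothetical internal kernel: keeps the aggressive policy and trusts
;; its declared types. It is meant to be called only from checked code.
(defun %scale-shift (mean sd z)
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (type double-float mean sd z))
  (+ mean (* sd z)))

;; Hypothetical public entry point: checks the arguments, coerces them
;; to DOUBLE-FLOAT, and only then calls the unchecked kernel.
(defun scale-shift (mean sd z)
  (declare (optimize (safety 1)))
  (check-type mean real)
  (check-type sd real)
  (check-type z real)
  (%scale-shift (coerce mean 'double-float)
                (coerce sd 'double-float)
                (coerce z 'double-float)))

;; Integer and single-float arguments are now accepted safely.
(scale-shift 0 1 2.0) ; => 2.0d0
```

With this split, a call like (scale-shift "0" 1 2) signals a TYPE-ERROR from CHECK-TYPE instead of reaching the unchecked code, while internal callers that already hold double-floats can still use the fast path directly.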