koka-lang / koka

Koka language compiler and interpreter
http://koka-lang.org
Other
3.16k stars 151 forks source link

segfault in vector-init #443

Closed chtenb closed 5 months ago

chtenb commented 5 months ago

Reproduction

pub fun vector/append(first : vector<a>, second : vector<a>) : _ vector<a>
  vector-init(first.length + second.length, fn(i) index-two(first, second, i))

fun vector/index-two(first : vector<a>, second : vector<a>, i: int) : _ a
  if i < first.length then first[i] else second[i]

pub fun main()
  val x = [1,2,3].vector
  val y = x.append(x)
  ()
TimWhiting commented 5 months ago

Don't you want second[i - first.length]?

chtenb commented 5 months ago

Yes that's right, but it should not segfault I think? It should throw the appropriate exception and error message in the console.

TimWhiting commented 5 months ago

Yeah, that should have been fixed in this commit: https://github.com/koka-lang/koka/commit/03793a957ee98b542f4ed4a5aefd1ae172f4506e

With the splitting of the core.kk file that code now is: here: https://github.com/koka-lang/koka/blob/68f4cdfa34e1706057cca827ea5f4f6fd59a15a6/lib/std/core/vector.kk#L57 here: https://github.com/koka-lang/koka/blob/68f4cdfa34e1706057cca827ea5f4f6fd59a15a6/lib/std/core/vector.kk#L42 and here: https://github.com/koka-lang/koka/blob/68f4cdfa34e1706057cca827ea5f4f6fd59a15a6/kklib/include/kklib/vector.h#L85

chtenb commented 5 months ago

Hm, I'm on the latest origin/dev which contains that commit

TimWhiting commented 5 months ago

Ahh, I know the issue. Throwing inside of initializing a vector causes the partially initialized vector to be freed including each element in the vector (if they only have a single reference), but not all of it's elements exist yet, so it tries to ref-count invalid memory.

chtenb commented 5 months ago

Are ints inside a vector refcounted individually?

TimWhiting commented 5 months ago

Yes, vectors are not currently specialized to non-refcounted types (as far as I know), and even if they were ints are arbitrary precision integers which use an efficient pointer tagged representation until they overflow: paper on integers in koka. Since they can overflow they do need ref-counting (but when not overflowed they are a value type and don't actually do any ref-counting in practice). int32 or int64 in contrast are non-refcounted primitive machine types, and I believe Koka boxes these types when used in a generic type.

As such, it is best if you just use ints, since they are designed to be efficient at ref counting (no ref counting unless require allocation), and do not need to be boxed / allocated due to the pointer tagging.

This vector issue does need fixing, but I don't know if it should be fixed by specializing drop / free for vectors, or if the initializing function should check for yielding in a final manner and fill the rest with values representations that don't require allocation, but can be ref-counted without causing a segfault. Or if we should initialize the memory with some value type representation that doesn't need ref-counting prior to filling it by executing the user's function. For the integer example this would essentially be the same as just initializing with https://github.com/koka-lang/koka/blob/68f4cdfa34e1706057cca827ea5f4f6fd59a15a6/lib/std/core/vector.kk#L71 prior to mapping over it with the user's function, but for non value types it would be more expensive unless we do it at a lower level.

daanx commented 5 months ago

Thanks! Tim was right, the vector was half initialized and freeing would fail. I fixed it in the latest dev by always initializing such array first with dummy values; since the vector is not reachable during initialization this should be safe.