ahrefs / ocannl

OCANNL: OCaml Compiles Algorithms for Neural Networks Learning
BSD 2-Clause "Simplified" License
69 stars 2 forks source link

Require that tensor nodes in the graph of a tensor are either embedded, or already part of the parent context #272

Closed lukstafi closed 4 months ago

lukstafi commented 4 months ago

A link-time check. Without this invariant, all_host_to_device that's part of sync_run is very broken: it only iterates embedded nodes. With the invariant it's still broken. For constant nodes, #234 fixes it; but for non-constant nodes, we also need to rethink all_host_to_devie and all_device_to_host wrt. non-embedded nodes.

Note that the invariant is related to a note I left somewhere in the docs that if two contexts on a single virtual device share a node but the node is not part of their first common ancestor, the situation is undefined.

lukstafi commented 4 months ago

Since embedded vs. non-embedded subtensors is a tensor-level abstraction, it's hard to do better than offer some more Train-level glue code and have current and future Train code check for the invariant.