This PR fixes the Metal build (broken by a previous change), adds it to CI (thanks to the new GitHub M1 runners!), and fixes a failure I see locally but not on CI: on some (newer? different?) macOS environments, the Metal runtime can wrap arbitrary bytes as an MTL::Buffer().
If true, this is awesome! But it does mean that in some cases we can "move" CPU arrays to the GPU without a copy, while in others we have to copy into page-aligned memory, and which case applies can only be determined at runtime.
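For reference, here is a minimal sketch of the copy fallback described above. It is not the code in this PR: the helper names (`page_size`, `is_page_aligned`, `round_up_to_page`, `copy_to_page_aligned`) are hypothetical, and it uses only POSIX calls, so it does not touch Metal at all. It assumes the no-copy path wants a page-aligned pointer and a length that is a whole number of pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <unistd.h>

// Hypothetical helpers for illustration; none of these names come from the PR.

// System page size in bytes, queried at runtime.
inline std::size_t page_size() {
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// True if `ptr` starts exactly on a page boundary.
inline bool is_page_aligned(const void* ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr) % page_size() == 0;
}

// Round `n` up to a whole number of pages.
inline std::size_t round_up_to_page(std::size_t n) {
    const std::size_t p = page_size();
    return (n + p - 1) / p * p;
}

// Fallback path: copy `n` bytes into freshly allocated page-aligned memory.
// Returns nullptr on allocation failure. The caller releases the result
// with std::free().
inline void* copy_to_page_aligned(const void* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    void* dst = nullptr;
    // Allocate at least one page so the capacity is a page multiple.
    const std::size_t capacity = round_up_to_page(n == 0 ? 1 : n);
    if (posix_memalign(&dst, page_size(), capacity) != 0) {
        return nullptr;
    }
    if (n > 0) {
        std::memcpy(dst, src, n);
    }
    return dst;
}
```

A caller could try the no-copy wrap first and only call `copy_to_page_aligned` when that wrap fails, which keeps the decision at runtime as described above. `is_page_aligned` alone is not a reliable predictor if some environments accept unaligned bytes.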