dmlc / dlpack

common in-memory tensor structure
https://dmlc.github.io/dlpack/latest
Apache License 2.0
905 stars 133 forks source link

[ABI break] Append new fields to DLManagedTensor #105

Closed tirthasheshpatel closed 1 year ago

tirthasheshpatel commented 2 years ago

Alternative to #101 Addresses #34 and #104 Implements the A1 alternative in https://github.com/dmlc/dlpack/issues/104#issuecomment-1090423607

Instead of appending the new fields to the DLTesor which changes the layout of DLManagedTensor in a backward incompatible manner, we can add the fields at the end of DLManagedTensor itself. This way we won't require new structs to maintain backward compatibility.

On Python front: should we add the __dlpack_info__ dunder and a version keyword as proposed in #101? I'd say yes because, without knowing the requested version, the producer (exporter) cannot tell if it can export a certain type of array e.g. NumPy throws an error if one tries to export a readonly array with the old ABI. On the consumer (importer) side, it can't tell which fields are safe to access without knowing what version is being imported.

cc @tqchen @mattip @seberg @rgommers @leofang

mattip commented 2 years ago

I think this need to include the __dlpack_info__ parts of #101 and to provide two structs: a v1 DLManagedTensor and a v2+ DLManagedTensorVersioned. The default will be to return a DLManagedTensor. If the consumer requests a higher version we can return a DLManagedTensorVersioned. Then it is the consumer's responsibility to check the extra fields iff they requested the higher version.

tirthasheshpatel commented 2 years ago

I think this need to include the __dlpack_info__ parts of #101 and to provide two structs: a v1 DLManagedTensor and a v2+ DLManagedTensorVersioned. The default will be to return a DLManagedTensor. If the consumer requests a higher version we can return a DLManagedTensorVersioned. Then it is the consumer's responsibility to check the extra fields iff they requested the higher version.

I thought the reason to append new fields to DLManagedTensor itself was to avoid the requirement of two different structs. Since the layout of DLManagedTensor or DLTensor doesn't change, there should not be any breaks in existing code if the same struct is exported with extra fields. Although, I agree we should add the __dlpack_info__ and version keyword (as we did in #101) so importers know what fields are safe to access and exporters know if they can export a certain type of array.

tqchen commented 1 year ago

closed in favor of #113

We would like to thanks @tirthasheshpatel for your effort on bringing in this change onward