conda-forge / tensorflow-feedstock

A conda-smithy repository for tensorflow.
BSD 3-Clause "New" or "Revised" License

tensorflow 2.15.0 #353

Closed · xhochy closed this 7 months ago

xhochy commented 8 months ago

Fixes #352

conda-forge-webservices[bot] commented 8 months ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

xhochy commented 8 months ago

Fails in the estimator build with:

+ bazel build tensorflow_estimator/tools/pip_package:build_pip_package
Starting local Bazel server and connecting to it...
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (1 packages loaded, 0 targets configured)
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (44 packages loaded, 283 targets configured)
INFO: Analyzed target //tensorflow_estimator/tools/pip_package:build_pip_package (47 packages loaded, 333 targets configured).
INFO: Found 1 target...
[0 / 8] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/canned/linear_optimizer/BUILD:55:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/canned/linear_optimizer/sharded_mutable_dense_hashtable_py_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/BUILD:633:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:dnn_linear_combined to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/dnn_linear_combined_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator:dnn_linear_combined) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
[12 / 75] Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:export_output to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/export_output_extracted_tensorflow_estimator_api.json.; 1s linux-sandbox ... (46 actions, 45 running)
Target //tensorflow_estimator/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/tools/pip_package/BUILD:18:10 Middleman _middlemen/tensorflow_Uestimator_Stools_Spip_Upackage_Sbuild_Upip_Upackage-runfiles failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
INFO: Elapsed time: 5.216s, Critical Path: 1.16s
INFO: 59 processes: 59 internal.
xhochy commented 8 months ago

Bisecting for this error:

Supposedly it is this:

% git bisect bad
7c8a95f2ab9b8996eccf5c33729018a45af467cb is the first bad commit
commit 7c8a95f2ab9b8996eccf5c33729018a45af467cb
Author: Shixin Li <shixinli@google.com>
Date:   Fri Sep 22 13:05:26 2023 -0700

    Enable cross compilation for PJRT GPU compiler:
    1. StreamExecutorGpuCompiler compiles w/o client.
    2. Add StreamExecutorGpuExecutable (the unloaded pjrt executable).
    3. Load StreamExecutorGpuExecutable to PjRtLoadedExecutable through `Load` API.

    PiperOrigin-RevId: 567697879

 third_party/xla/xla/client/local_client.h          |   2 +
 third_party/xla/xla/pjrt/BUILD                     |  16 ++
 third_party/xla/xla/pjrt/gpu/BUILD                 |  95 +++++++++++-
 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.cc |  45 ++++++
 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.h  |   5 +
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.cc       | 108 +++++++++++++
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.h        |  15 ++
 .../xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc  | 167 +++++++++++++++++++++
 .../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_test.cc  |   1 +
 .../xla/xla/pjrt/pjrt_stream_executor_client.cc    |  39 ++---
 .../xla/xla/pjrt/pjrt_stream_executor_client.h     |   1 +
 .../pjrt/stream_executor_unloaded_executable.cc    |  31 ++++
 .../xla/pjrt/stream_executor_unloaded_executable.h |  78 ++++++++++
 .../pjrt/stream_executor_unloaded_executable.proto |  28 ++++
 third_party/xla/xla/service/gpu/BUILD              |  14 ++
 third_party/xla/xla/service/gpu/gpu_compiler.cc    |  15 --
 third_party/xla/xla/service/gpu/gpu_compiler.h     |  13 +-
 .../xla/xla/service/gpu/gpu_target_config.cc       |  38 +++++
 .../xla/xla/service/gpu/gpu_target_config.h        |  41 +++++
 19 files changed, 705 insertions(+), 47 deletions(-)
 create mode 100644 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.cc
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.h
 create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.proto
 create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.cc
 create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.h

Maybe one of my bisects took a wrong turn?
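For reference, the bisect session above follows the standard git-bisect shape. Here is a minimal self-contained sketch on a throwaway repository (the repo, commits, and goodness predicate are all made up for illustration; an automated `git bisect run` like this is also exactly where a bisect can "take a wrong turn" if the predicate is flaky):

```shell
# Minimal git-bisect sketch on a disposable repo (illustration only).
# Commits c1..c5 each write their index into file f; we pretend the
# bug appeared in c4, so the test predicate (exit 0 = good) is: f < 4.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name you
for i in 1 2 3 4 5; do
  echo "$i" > f
  git add f
  git commit -qm "c$i"
done
root=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$root"    # HEAD is bad, the root commit is good
result=$(git bisect run sh -c 'test "$(cat f)" -lt 4')
git bisect reset >/dev/null
printf '%s\n' "$result"          # includes "... is the first bad commit"
```

`git bisect run` treats exit code 0 as good and 1-127 (except 125, which means "skip") as bad, so any scriptable reproduction of the failure can drive the whole search.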

xhochy commented 7 months ago

I got past the problem by carefully reading the Bazel scripts of riegeli. Next stop: CUDA.

xhochy commented 7 months ago

CUDA builds fail with the following (I have no idea what it means):

/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h: In constructor 'absl::lts_20230125::str_format_internal::FormatSpecTemplate<Args>::FormatSpecTemplate(const absl::lts_20230125::str_format_internal::ExtendedParsedFormat<absl::lts_20230125::FormatConversionCharSet(C)...>&)':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:1: error: parse error in template argument list
  171 |     CheckArity<sizeof...(C), sizeof...(Args)>();
      | ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:63: error: expected ';' before ')' token
  171 |     CheckArity<sizeof...(C), sizeof...(Args)>();
      |                                                               ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:147: error: template argument 1 is invalid
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                   ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected primary-expression before '{' token
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                       ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected ';' before '{' token
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:153: error: expected primary-expression before ')' token
  172 |     CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
      |                                                                                                                                                         ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h: In instantiation of 'constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ArgumentToConv() [with Arg = long int]':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:268:156:   required by substitution of 'template<class ... Args> using FormatSpec = absl::lts_20230125::str_format_internal::FormatSpecTemplate<absl::lts_20230125::FormatConversionCharSet((ArgumentToConv<Args>)())...> [with Args = {long int, const tensorflow::ResourceBase*}]'
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:351:1:   required by substitution of 'template<class ... Args> std::string absl::lts_20230125::StrFormat(absl::lts_20230125::FormatSpec<Args ...>&, const Args& ...) [with Args = {long int, const tensorflow::ResourceBase*}]'
./tensorflow/core/framework/resource_base.h:44:23:   required from here
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: error: no matching function for call to 'ExtractCharSet(ConvResult)'
  403 |   return absl::str_format_internal::ExtractCharSet(ConvResult{});
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::FormatConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
  196 | constexpr FormatConversionCharSet ExtractCharSet(FormatConvertResult<C>) {
      | ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note:   template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note:   couldn't deduce template parameter 'C'
  403 |   return absl::str_format_internal::ExtractCharSet(ConvResult{});
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::str_format_internal::ArgConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
  201 | constexpr FormatConversionCharSet ExtractCharSet(ArgConvertResult<C>) {
      | ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note:   template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note:   couldn't deduce template parameter 'C'
…
xhochy commented 7 months ago

Next one:

tensorflow/core/kernels/cast_op_gpu.cu.cc(32): warning #846-D: this partial specialization would have made the instantiation of class "tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, tsl::float8_e4m3fn, Eigen::half>" ambiguous

external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::uint4" to "tsl::int4" applies:
            function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
            function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
          detected during:
            instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4, EnableIf=void]"
(462): here
            instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
            instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
            instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::uint4, TargetType=tsl::int4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
            instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::int4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"

external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
            instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
            instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
            instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
            instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
            instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
            instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::int4, IN_TYPE=tsl::uint4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(177): here

external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::int4" to "tsl::uint4" applies:
            function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
            function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
          detected during:
            instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4, EnableIf=void]"
(462): here
            instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
            instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
            instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::int4, TargetType=tsl::uint4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
            instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::uint4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"

external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
            instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
            instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
            instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
            instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
            instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
            instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::uint4, IN_TYPE=tsl::int4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(178): here

2 errors detected in the compilation of "tensorflow/core/kernels/cast_op_gpu.cu.cc".
hmaarrfk commented 7 months ago

can I ask what your workflow is for this? how do you setup your environment?

xhochy commented 7 months ago

can I ask what your workflow is for this? how do you setup your environment?

hmaarrfk commented 7 months ago

interesting. thanks!

xhochy commented 7 months ago

@conda-forge/tensorflow This is ready for review. I will clean up the patches locally and would start building everything on Friday.

xhochy commented 7 months ago

> Can you tell us what happened with the protobuf situation? It sounds like you're adding some changes related to that, but I don't see the pinned versions changing in the .ci_support files.

Nothing has changed. All the protobuf-related errors were down to the protobuf_toolchain and not the version itself. Once this is merged, I will start working on the unpinned build again.

> except perhaps my recurring nit of generating the patches with --no-signature. ;-)

I always forget that option. If you find a way to set that as default globally in git, I would appreciate it.
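On making `--no-signature` the default: git's `format.signature` config variable controls the signature that `git format-patch` appends, and setting it to the empty string suppresses it, so a one-time `git config --global format.signature ""` should cover it. A small sketch on a throwaway repo (repo and file names are made up for illustration):

```shell
# Show that format.signature "" suppresses the trailing "-- " + git-version
# signature block that git format-patch normally appends to each patch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name you
echo demo > f
git add f
git commit -qm "demo commit"
with_sig=$(git format-patch -1 --stdout)    # default: ends with "-- " + version
git config format.signature ""              # use --global to apply to all repos
without_sig=$(git format-patch -1 --stdout)
printf '%s\n' "$without_sig" | tail -n 2
```

With the config set globally, freshly regenerated patches no longer carry the signature, so re-exports of the same commits stop producing spurious diffs in the feedstock.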

xhochy commented 7 months ago

I pushed the patches also as a branch to https://github.com/xhochy/tensorflow/tree/2.15.0-conda-forge-patches

xhochy commented 7 months ago

There were some issues with the OSX builds, but it seems we're fine now, and I have started the Linux and OSX builds for all configurations.

xhochy commented 7 months ago

Builds are on my uwe.korn-tf-gpu and uwe.korn-tf-experimental channels with the following logs:

xhochy commented 7 months ago

@h-vetinari @hmaarrfk Please review/copy ;)

hmaarrfk commented 7 months ago

Would the goal be for one of us to do light testing? I'm mostly trying to understand a protocol that we can follow in the future too.

xhochy commented 7 months ago

Testing should hopefully be covered by the tests in the feedstock; otherwise, we should extend those. I think isuruf scanned these logs to check whether they used the right OSX SDK, but that was back when build-locally.py didn't take care of that.

hmaarrfk commented 7 months ago

It's just pretty hard to test hardware acceleration without guaranteed access to the right hardware.

I can scan the logs.

hmaarrfk commented 7 months ago

thank you hugely

yuvipanda commented 7 months ago

(as a standby observer) - THANK YOU SO MUCH FOR WORKING ON THIS!