Closed by xhochy 7 months ago
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
The estimator build fails with:
+ bazel build tensorflow_estimator/tools/pip_package:build_pip_package
Starting local Bazel server and connecting to it...
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (1 packages loaded, 0 targets configured)
Analyzing: target //tensorflow_estimator/tools/pip_package:build_pip_package (44 packages loaded, 283 targets configured)
INFO: Analyzed target //tensorflow_estimator/tools/pip_package:build_pip_package (47 packages loaded, 333 targets configured).
INFO: Found 1 target...
[0 / 8] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/canned/linear_optimizer/BUILD:55:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/canned/linear_optimizer/sharded_mutable_dense_hashtable_py_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/python/estimator/BUILD:633:11: Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:dnn_linear_combined to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/dnn_linear_combined_extracted_tensorflow_estimator_api.json. failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator:dnn_linear_combined) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
[libprotobuf ERROR google/protobuf/descriptor_database.cc:642] File already exists in database: google/protobuf/descriptor.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
[12 / 75] Extracting tensorflow_estimator APIs for //tensorflow_estimator/python/estimator:export_output to bazel-out/k8-fastbuild/bin/tensorflow_estimator/python/estimator/export_output_extracted_tensorflow_estimator_api.json.; 1s linux-sandbox ... (46 actions, 45 running)
Target //tensorflow_estimator/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /home/uwe/mambaforge/conda-bld/tensorflow-split_1700044396155/work/tensorflow-estimator/tensorflow_estimator/tools/pip_package/BUILD:18:10 Middleman _middlemen/tensorflow_Uestimator_Stools_Spip_Upackage_Sbuild_Upip_Upackage-runfiles failed: (Aborted): extractor_wrapper failed: error executing command (from target //tensorflow_estimator/python/estimator/canned/linear_optimizer:sharded_mutable_dense_hashtable_py) bazel-out/k8-opt-exec-2B5CBBC6/bin/tensorflow_estimator/python/estimator/api/extractor_wrapper --output ... (remaining 6 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
INFO: Elapsed time: 5.216s, Critical Path: 1.16s
INFO: 59 processes: 59 internal.
Bisecting for this error:
✅ 2.14.0
❌ 2.15.0
Supposedly it is this:
% git bisect bad
7c8a95f2ab9b8996eccf5c33729018a45af467cb is the first bad commit
commit 7c8a95f2ab9b8996eccf5c33729018a45af467cb
Author: Shixin Li <shixinli@google.com>
Date: Fri Sep 22 13:05:26 2023 -0700
Enable cross compilation for PJRT GPU compiler:
1. StreamExecutorGpuCompiler compiles w/o client.
2. Add StreamExecutorGpuExecutable (the unloaded pjrt executable).
3. Load StreamExecutorGpuExecutable to PjRtLoadedExecutable through `Load` API.
PiperOrigin-RevId: 567697879
third_party/xla/xla/client/local_client.h | 2 +
third_party/xla/xla/pjrt/BUILD | 16 ++
third_party/xla/xla/pjrt/gpu/BUILD | 95 +++++++++++-
third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.cc | 45 ++++++
third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_client.h | 5 +
.../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.cc | 108 +++++++++++++
.../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler.h | 15 ++
.../xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc | 167 +++++++++++++++++++++
.../xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_test.cc | 1 +
.../xla/xla/pjrt/pjrt_stream_executor_client.cc | 39 ++---
.../xla/xla/pjrt/pjrt_stream_executor_client.h | 1 +
.../pjrt/stream_executor_unloaded_executable.cc | 31 ++++
.../xla/pjrt/stream_executor_unloaded_executable.h | 78 ++++++++++
.../pjrt/stream_executor_unloaded_executable.proto | 28 ++++
third_party/xla/xla/service/gpu/BUILD | 14 ++
third_party/xla/xla/service/gpu/gpu_compiler.cc | 15 --
third_party/xla/xla/service/gpu/gpu_compiler.h | 13 +-
.../xla/xla/service/gpu/gpu_target_config.cc | 38 +++++
.../xla/xla/service/gpu/gpu_target_config.h | 41 +++++
19 files changed, 705 insertions(+), 47 deletions(-)
create mode 100644 third_party/xla/xla/pjrt/gpu/se_gpu_pjrt_compiler_aot_test.cc
create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.cc
create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.h
create mode 100644 third_party/xla/xla/pjrt/stream_executor_unloaded_executable.proto
create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.cc
create mode 100644 third_party/xla/xla/service/gpu/gpu_target_config.h
Maybe one of my bisects took a wrong turn?
I got past the problem by carefully reading the Bazel scripts of riegeli. Next stop: CUDA.
CUDA builds fail with the following (I have no idea what it means):
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h: In constructor 'absl::lts_20230125::str_format_internal::FormatSpecTemplate<Args>::FormatSpecTemplate(const absl::lts_20230125::str_format_internal::ExtendedParsedFormat<absl::lts_20230125::FormatConversionCharSet(C)...>&)':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:1: error: parse error in template argument list
171 | CheckArity<sizeof...(C), sizeof...(Args)>();
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:171:63: error: expected ';' before ')' token
171 | CheckArity<sizeof...(C), sizeof...(Args)>();
| ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:147: error: template argument 1 is invalid
172 | CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
| ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected primary-expression before '{' token
172 | CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
| ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:151: error: expected ';' before '{' token
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/bind.h:172:153: error: expected primary-expression before ')' token
172 | CheckMatches<C...>(absl::make_index_sequence<sizeof...(C)>{});
| ^
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h: In instantiation of 'constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ArgumentToConv() [with Arg = long int]':
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:268:156: required by substitution of 'template<class ... Args> using FormatSpec = absl::lts_20230125::str_format_internal::FormatSpecTemplate<absl::lts_20230125::FormatConversionCharSet((ArgumentToConv<Args>)())...> [with Args = {long int, const tensorflow::ResourceBase*}]'
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/str_format.h:351:1: required by substitution of 'template<class ... Args> std::string absl::lts_20230125::StrFormat(absl::lts_20230125::FormatSpec<Args ...>&, const Args& ...) [with Args = {long int, const tensorflow::ResourceBase*}]'
./tensorflow/core/framework/resource_base.h:44:23: required from here
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: error: no matching function for call to 'ExtractCharSet(ConvResult)'
403 | return absl::str_format_internal::ExtractCharSet(ConvResult{});
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::FormatConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
196 | constexpr FormatConversionCharSet ExtractCharSet(FormatConvertResult<C>) {
| ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:196:1: note: template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note: couldn't deduce template parameter 'C'
403 | return absl::str_format_internal::ExtractCharSet(ConvResult{});
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note: candidate: 'template<absl::lts_20230125::FormatConversionCharSet C> constexpr absl::lts_20230125::FormatConversionCharSet absl::lts_20230125::str_format_internal::ExtractCharSet(absl::lts_20230125::str_format_internal::ArgConvertResult<(absl::lts_20230125::FormatConversionCharSet)(C)>)'
201 | constexpr FormatConversionCharSet ExtractCharSet(ArgConvertResult<C>) {
| ^~~~~~~~~~~~~~
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:201:1: note: template argument deduction/substitution failed:
/home/conda/feedstock_root/build_artifacts/tensorflow-split_1700557095434/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/include/absl/strings/internal/str_format/arg.h:403:43: note: couldn't deduce template parameter 'C'
…
Next one:
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): warning #846-D: this partial specialization would have made the instantiation of class "tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, tsl::float8_e4m3fn, Eigen::half>" ambiguous
external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::uint4" to "tsl::int4" applies:
function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
detected during:
instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4, EnableIf=void]"
(462): here
instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::uint4, NewType=tsl::int4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::uint4, TargetType=tsl::int4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::int4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::int4, const Eigen::TensorMap<Eigen::Tensor<const tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::int4, IN_TYPE=tsl::uint4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(177): here
external/eigen_archive/Eigen/src/Core/MathFunctions.h(429): error: more than one user-defined conversion from "const tsl::int4" to "tsl::uint4" applies:
function template "ml_dtypes::i4<UnderlyingTy>::operator T() const [with UnderlyingTy=int8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(52): here
function template "ml_dtypes::i4<UnderlyingTy>::i4(T) [with UnderlyingTy=uint8_t]"
bazel-out/k8-opt/bin/external/ml_dtypes/_virtual_includes/int4/ml_dtypes/include/int4.h(42): here
detected during:
instantiation of "NewType Eigen::internal::cast_impl<OldType, NewType, EnableIf>::run(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4, EnableIf=void]"
(462): here
instantiation of "NewType Eigen::internal::cast<OldType,NewType>(const OldType &) [with OldType=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/Eigen/src/Core/functors/UnaryFunctors.h(179): here
instantiation of "const NewType Eigen::internal::scalar_cast_op<Scalar, NewType>::operator()(const Scalar &) const [with Scalar=tsl::int4, NewType=tsl::uint4]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(238): here
instantiation of "TargetType Eigen::internal::CoeffConv<SrcType, TargetType, IsSameT>::run(const Eigen::TensorEvaluator<ArgType, Device> &, Eigen::Index) [with SrcType=tsl::int4, TargetType=tsl::uint4, IsSameT=false, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorConversion.h(395): here
instantiation of "Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::CoeffReturnType Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::coeff(Eigen::TensorEvaluator<const Eigen::TensorConversionOp<TargetType, ArgType>, Device>::Index) const [with TargetType=tsl::uint4, ArgType=const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h(174): here
instantiation of "void Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::evalScalar(Eigen::TensorEvaluator<const Eigen::TensorAssignOp<LeftArgType, RightArgType>, Device>::Index) const [with LeftArgType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, RightArgType=const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>, Device=Eigen::GpuDevice]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(607): here
instantiation of "void Eigen::internal::EigenMetaKernelEval<Evaluator, StorageIndex, Vectorizable>::run(Evaluator &, StorageIndex, StorageIndex, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex, Vectorizable=false]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(644): here
instantiation of "void Eigen::internal::EigenMetaKernel(Evaluator, StorageIndex) [with Evaluator=Eigen::TensorEvaluator<const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Eigen::GpuDevice>, StorageIndex=Eigen::DenseIndex]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h(665): here
instantiation of "void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable, Tiling>::run(const Expression &, const Eigen::GpuDevice &) [with Expression=const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, const Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>>, Vectorizable=false, Tiling=Eigen::internal::Off]"
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h(39): here
instantiation of "Eigen::TensorDevice<ExpressionType, DeviceType> &Eigen::TensorDevice<ExpressionType, DeviceType>::operator=(const OtherDerived &) [with ExpressionType=Eigen::TensorMap<Eigen::Tensor<tsl::uint4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>, DeviceType=tensorflow::functor::GPUDevice, OtherDerived=Eigen::TensorConversionOp<tsl::uint4, const Eigen::TensorMap<Eigen::Tensor<const tsl::int4, 1, 1, Eigen::DenseIndex>, 16, Eigen::MakePointer>>]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(32): here
instantiation of "void tensorflow::functor::CastFunctor<tensorflow::functor::GPUDevice, OUT_TYPE, IN_TYPE>::operator()(const tensorflow::functor::GPUDevice &, tensorflow::TTypes<OUT_TYPE, 1, Eigen::DenseIndex>::Flat, tensorflow::TTypes<IN_TYPE, 1, Eigen::DenseIndex>::ConstFlat, __nv_bool) [with OUT_TYPE=tsl::uint4, IN_TYPE=tsl::int4]"
tensorflow/core/kernels/cast_op_gpu.cu.cc(178): here
2 errors detected in the compilation of "tensorflow/core/kernels/cast_op_gpu.cu.cc".
can I ask what your workflow is for this? how do you setup your environment?
conda-build
source build_env_setup.sh
git init . && git add . && git commit -m "Initial commit" --no-verify --no-gpg-sign
bash $RECIPE_DIR/build.sh
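The git step of that workflow can be exercised standalone in a scratch directory. The conda-build parts (`source build_env_setup.sh`, `bash $RECIPE_DIR/build.sh`) only make sense inside conda-build's work directory, so they appear here as comments; the scratch directory and file name are made up:

```shell
# Real workflow, inside conda-build's work dir after the build fails:
#   source build_env_setup.sh     # re-activate the build environment
#   bash "$RECIPE_DIR/build.sh"   # re-run the recipe's build script
# The git-init step turns the unpacked sources into a repo so that
# patches can be iterated on with normal git tooling:
workdir="$(mktemp -d)"
cd "$workdir"
echo "dummy source" > file.txt

git init -q .
git add .
# --no-verify skips commit hooks; --no-gpg-sign ignores any signing config.
git -c user.email=ci@example.com -c user.name=ci \
    commit -q -m "Initial commit" --no-verify --no-gpg-sign

git log -1 --format=%s   # prints: Initial commit
```

With the sources committed, a later `git diff` or `git format-patch` against that initial commit yields the feedstock patches.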
interesting. thanks!
@conda-forge/tensorflow This is ready for review. I will clean up the patches locally and would start building everything on Friday.
Can you tell us what happened with the protobuf situation? It sounds like you're adding some changes related to that, but I don't see the pinned versions changing in the .ci_support files.
Nothing has changed. All the protobuf related errors were down to the protobuf_toolchain and not the version itself. Once this is merged, I would start working on the unpinned build again.
except perhaps my recurring nit of generating the patches with --no-signature. ;-)
I always forget that option. If you find a way to set that as default globally in git, I would appreciate that.
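For what it's worth, git's `format.signature` config key controls the trailer that `--no-signature` suppresses, so setting it to an empty string globally has the same effect by default. A sketch, using a throwaway HOME so the demo doesn't touch the real global config (drop the `export HOME` line to change your actual config):

```shell
export HOME="$(mktemp -d)"   # throwaway global config, demo only

# An empty format.signature suppresses the "-- \n<git version>" trailer
# that `git format-patch` appends by default (same as passing --no-signature).
git config --global format.signature ""

git config --global format.signature   # prints an empty value
```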
I pushed the patches also as a branch to https://github.com/xhochy/tensorflow/tree/2.15.0-conda-forge-patches
There were some issues now with the OSX builds but it seems we're fine and I have started the Linux and OSX builds now for all configurations.
Builds are on my uwe.korn-tf-gpu and uwe.korn-tf-experimental channels with the following logs:
@h-vetinari @hmaarrfk Please review/copy ;)
Would the goal be for one of us to do light testing? I'm mostly trying to understand a protocol that we can follow in the future too.
Testing should hopefully be covered by the tests in the feedstock. Otherwise, we should extend that. I think isuruf scanned these logs on whether they used the right OSX SDK. But that was back when build-locally.py didn't take care of that.
It's just pretty hard to test hardware acceleration without guaranteed access to the right hardware.
I can scan the logs.
thank you hugely
(as a standby observer) - THANK YOU SO MUCH FOR WORKING ON THIS!
Fixes #352