tensorflow / io

Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Apache License 2.0

Decoding MP3s from in-memory data #811

Closed jjedele closed 4 years ago

jjedele commented 4 years ago

Hi @yongtang,

As mentioned in #758, I was working on my own operator implementation to decode MP3s. My implementation turned out a bit different from yours: it decodes from in-memory data instead of a file name, similar to tf.audio.decode_wav and tf.io.decode_image. This is helpful if you have lots of smaller MP3 files inside a TFRecord file or something similar.

Would this be something to add to this repository or does it conflict with the IOTensor design?

https://github.com/jjedele/tf-decode-mp3
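
To illustrate the difference between decoding from a file name and decoding from in-memory bytes, here is a minimal sketch that uses only the Python standard library. It is not the operator from the linked repository: WAV (via the `wave` module) stands in for MP3, and a simple length-prefixed container stands in for a TFRecord file. The helper names (`make_wav_bytes`, `decode_wav_bytes`, `pack_records`, `iter_records`) are hypothetical and exist only for this example.

```python
import io
import struct
import wave


def make_wav_bytes(samples, sample_rate=8000):
    """Encode 16-bit mono PCM samples as a WAV file held entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        # 16-bit little-endian PCM frames
        writer.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def decode_wav_bytes(data):
    """Decode WAV bytes without touching the file system."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        frames = reader.readframes(reader.getnframes())
        count = len(frames) // 2
        return reader.getframerate(), list(struct.unpack("<%dh" % count, frames))


def pack_records(blobs):
    """Concatenate blobs into one container, each prefixed by a 4-byte length."""
    return b"".join(struct.pack("<I", len(blob)) + blob for blob in blobs)


def iter_records(container):
    """Yield each length-prefixed blob from the container."""
    offset = 0
    while offset < len(container):
        (size,) = struct.unpack_from("<I", container, offset)
        offset += 4
        yield container[offset:offset + size]
        offset += size


# Many small clips live in one container; each is decoded straight from bytes.
clips = [[0, 1000, -1000], [5, 6, 7, 8]]
container = pack_records(make_wav_bytes(clip) for clip in clips)
decoded = [decode_wav_bytes(blob) for blob in iter_records(container)]
```

A file-name-based decoder would force each clip to be written to disk before it could be read, which is the overhead the in-memory variant avoids when records are already packed together.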

yongtang commented 4 years ago

@jjedele That makes a lot of sense! We have "basic" ops in tensorflow/io that work on images, e.g., https://www.tensorflow.org/io/api_docs/python/tfio/image

Basic ops for audio encoding/decoding would be great 👍 Would you like to create a PR for that?

yongtang commented 4 years ago

@jjedele The API could be placed under tfio.audio.decode/tfio.audio.encode if you prefer.

jjedele commented 4 years ago

@yongtang Cool! Sure, I'll see if I can get started on one over the weekend. I will probably need some help integrating it correctly with the build, etc.

yongtang commented 4 years ago

@jjedele Let me know if you need any help. You can start with a work-in-progress PR first.

I know Bazel is not exactly a convenient tool 😄, but we stay with Bazel to conform to the TensorFlow core repo's overall build. I will be glad to help with any build issues.

jjedele commented 4 years ago

@yongtang I just tried to build the repo as is, and it already fails on my machine. Something related to libboost.

ERROR: /private/var/tmp/_bazel_jeff/1e1f23125705128f6d775e84570984b9/external/boost/BUILD.bazel:10:1: C++ compilation of rule '@boost//:boost' failed (Exit 1) cc_wrapper.sh failed: error executing command
  (cd /private/var/tmp/_bazel_jeff/1e1f23125705128f6d775e84570984b9/sandbox/darwin-sandbox/1768/execroot/org_tensorflow_io && \
  exec env - \
    PATH='/Users/jeff/anaconda3/bin:/Users/jeff/anaconda3/condabin:/Users/jeff/go/bin:/Users/jeff/bin/google-cloud-sdk/bin:/Users/jeff/Library/Haskell/bin:/Users/jeff/anaconda/bin:/Applications/MacVim.app/Contents/bin:/Users/jeff/bin/scala-2.12.3/bin:~/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/local/git/bin:/Library/TeX/texbin:/Library/Frameworks/Python.framework/Versions/3.7/bin:/Users/jeff/.cargo/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/go/bin:/usr/local/share/dotnet:/opt/X11/bin:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/usr/local/git/bin:/Applications/Xamarin Workbooks.app/Contents/SharedSupport/path-bin:/Users/jeff/bin:/Users/jeff/.local/bin:/Users/jeff/.cabal/bin:/Users/jeff/Applications/ghc-7.10.3.app/Contents/bin' \
    PWD=/proc/self/cwd \
    TF_HEADER_DIR=/Users/jeff/anaconda3/lib/python3.7/site-packages/tensorflow_core/include \
    TF_SHARED_LIBRARY_DIR=/Users/jeff/anaconda3/lib/python3.7/site-packages/tensorflow_core \
    TF_SHARED_LIBRARY_NAME=libtensorflow_framework.2.dylib \
  external/local_config_cc/cc_wrapper.sh -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++0x' -MD -MF bazel-out/darwin-fastbuild/bin/external/boost/_objs/boost/operations.pic.d '-frandom-seed=bazel-out/darwin-fastbuild/bin/external/boost/_objs/boost/operations.pic.o' -fPIC -DHAVE_CONFIG_H -DXXH_PRIVATE_API '-DBOOST_ALL_NO_LIB=1' -iquote external/boost -iquote bazel-out/darwin-fastbuild/bin/external/boost -iquote external/bzip2 -iquote bazel-out/darwin-fastbuild/bin/external/bzip2 -iquote external/xz -iquote bazel-out/darwin-fastbuild/bin/external/xz -iquote external/zlib -iquote bazel-out/darwin-fastbuild/bin/external/zlib -iquote external/zstd -iquote bazel-out/darwin-fastbuild/bin/external/zstd -isystem external/boost -isystem bazel-out/darwin-fastbuild/bin/external/boost -isystem external/bzip2 -isystem bazel-out/darwin-fastbuild/bin/external/bzip2 -isystem external/xz -isystem bazel-out/darwin-fastbuild/bin/external/xz -isystem external/xz/src -isystem bazel-out/darwin-fastbuild/bin/external/xz/src -isystem external/xz/src/common -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/common -isystem external/xz/src/liblzma -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma -isystem external/xz/src/liblzma/api -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/api -isystem external/xz/src/liblzma/check -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/check -isystem external/xz/src/liblzma/common -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/common -isystem external/xz/src/liblzma/delta -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/delta -isystem external/xz/src/liblzma/lz -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/lz -isystem external/xz/src/liblzma/lzma -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/lzma -isystem external/xz/src/liblzma/rangecoder -isystem 
bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/rangecoder -isystem external/xz/src/liblzma/simple -isystem bazel-out/darwin-fastbuild/bin/external/xz/src/liblzma/simple -isystem external/zlib -isystem bazel-out/darwin-fastbuild/bin/external/zlib -isystem external/zlib/contrib/minizip -isystem bazel-out/darwin-fastbuild/bin/external/zlib/contrib/minizip -isystem external/zstd/lib -isystem bazel-out/darwin-fastbuild/bin/external/zstd/lib -isystem external/zstd/lib/common -isystem bazel-out/darwin-fastbuild/bin/external/zstd/lib/common '-D_GLIBCXX_USE_CXX11_ABI=0' -DGRPC_BAZEL_BUILD -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/boost/libs/filesystem/src/operations.cpp -o bazel-out/darwin-fastbuild/bin/external/boost/_objs/boost/operations.pic.o)
Execution platform: @local_config_platform//:host

...

Use --sandbox_debug to see verbose messages from the sandbox
In file included from external/boost/libs/filesystem/src/operations.cpp:72:
external/boost/boost/filesystem/file_status.hpp:40:6: error: redefinition of 'file_type'
enum file_type
     ^
/usr/local/include/boost/filesystem/operations.hpp:173:8: note: previous definition is here
  enum file_type
       ^
In file included from external/boost/libs/filesystem/src/operations.cpp:72:
external/boost/boost/filesystem/file_status.hpp:66:6: error: redefinition of 'perms'
enum perms
     ^
/usr/local/include/boost/filesystem/operations.hpp:199:8: note: previous definition is here
  enum perms

... many similar redefinition errors ...

Any idea what could be going on? Seems like it's somehow mixing a tfio-local and a global installation of boost. I installed Bazel 2.1.0 instead of 2.0.0, but that hopefully shouldn't be the problem?

jjedele commented 4 years ago

Ok, seems I fixed it. I had to uninstall my global libboost via brew as hinted at here: https://github.com/ray-project/ray/issues/6319.
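
For anyone hitting the same symptom, the mix of a vendored and a system-wide header tree can be confirmed directly from the compiler output. The sketch below pairs each "redefinition" error with the "previous definition is here" note that follows it and checks whether the two paths come from different trees. The function names and the `external/` and `/usr/` prefixes are assumptions chosen to match the log above; this is not a tool shipped with Bazel or tensorflow/io.

```python
import re

# Matches clang-style diagnostics such as "path/to/file.hpp:40:6: error: message".
DIAG = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>error|note): (?P<msg>.*)$"
)


def find_header_conflicts(log_text):
    """Pair each 'redefinition' error with the note naming the earlier definition."""
    conflicts = []
    pending = None
    for raw in log_text.splitlines():
        match = DIAG.match(raw.strip())
        if not match:
            continue  # skip source excerpts, carets, and "In file included from" lines
        if match["kind"] == "error" and "redefinition of" in match["msg"]:
            pending = match
        elif match["kind"] == "note" and "previous definition" in match["msg"] and pending:
            conflicts.append((pending["msg"], pending["path"], match["path"]))
            pending = None
    return conflicts


def mixes_vendored_and_system(conflicts, vendored_prefix="external/", system_prefix="/usr/"):
    """Return True if any conflict spans the vendored tree and a system include path."""
    return any(
        new.startswith(vendored_prefix) and old.startswith(system_prefix)
        for _, new, old in conflicts
    )


# Excerpt of the compiler output quoted earlier in this thread.
sample_log = """\
In file included from external/boost/libs/filesystem/src/operations.cpp:72:
external/boost/boost/filesystem/file_status.hpp:40:6: error: redefinition of 'file_type'
enum file_type
     ^
/usr/local/include/boost/filesystem/operations.hpp:173:8: note: previous definition is here
  enum file_type
       ^
In file included from external/boost/libs/filesystem/src/operations.cpp:72:
external/boost/boost/filesystem/file_status.hpp:68:6: error: redefinition of 'perms'
enum perms
     ^
/usr/local/include/boost/filesystem/operations.hpp:199:8: note: previous definition is here
  enum perms
"""

conflicts = find_header_conflicts(sample_log)
has_mixed_trees = mixes_vendored_and_system(conflicts)
```

When `has_mixed_trees` is true, headers under `/usr/local/include` are shadowing or colliding with the copy Bazel fetched, which is consistent with the fix of removing the globally installed boost.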

yongtang commented 4 years ago

@jjedele glad it works. Let me know if you encounter any further issues.

yongtang commented 4 years ago

I think this issue has been resolved with the latest release. I will close it for now, but feel free to reopen if there are other issues.