p-lambda / jukemir

Perform transfer learning for MIR using Jukebox!
MIT License

3_extract.sh not generating outputs #6

Closed LiyangTseng closed 2 years ago

LiyangTseng commented 2 years ago

Hi, I really appreciate this nice work. I ran into a problem when trying to reproduce the experimental results. After executing 3_extract.sh following the instructions in the README, there is nothing inside the representation output folder (for example, ~/.jukemir/representations/gtzan_ff/jukebox), and the terminal printed only the following text, with no error message.

lab812@lab812-Z390-UD:~/jukemir/reproduce$ bash 3_extract.sh 
  0%|                                                                                                                                   | 0/4 [00:00<?, ?it/s]Using cuda True
Downloading from azure
Restored from /root/.cache/jukebox/models/5b/vqvae.pth.tar
0: Loading vqvae in eval mode
lab812@lab812-Z390-UD:

Is it because my hardware does not meet the stated requirements (at least 30GB of RAM and a GPU with at least 12GB)? Thanks in advance for your reply!
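
The requirements quoted above (at least 30GB of RAM and a GPU with at least 12GB) can be checked before launching the container. The following is a minimal sketch, assuming a Linux host where total RAM is read from /proc/meminfo. The function names are hypothetical, and the GPU memory figure is passed in by hand (for example, read off `nvidia-smi`).

```python
def mem_total_gib(meminfo_text):
    """Return total RAM in GiB from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The value is reported in kB (really KiB), e.g. "MemTotal: 16384000 kB".
            return int(line.split()[1]) / 1024**2
    raise ValueError("MemTotal not found")


def meets_requirements(ram_gib, gpu_gib, min_ram_gib=30, min_gpu_gib=12):
    """Compare a machine against the thresholds quoted in this thread."""
    return ram_gib >= min_ram_gib and gpu_gib >= min_gpu_gib


# Sample text standing in for the real /proc/meminfo (values are made up).
sample = "MemTotal:       16384000 kB\nMemFree:         1024000 kB\n"
print(mem_total_gib(sample))                         # 15.625
print(meets_requirements(mem_total_gib(sample), 8))  # False
```

On a real machine, pass `open("/proc/meminfo").read()` instead of the sample string.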

rodrigo-castellon commented 2 years ago

That's very odd. Have you tried adding print statements after that one to check where exactly it fails? (I can tell that it fails at some point before making the prior transformer, but I'm not sure what would cause that...)
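
When a process stops without a traceback, as in the log above, its exit status is one more clue alongside print statements. The following is a minimal sketch, assuming a Linux host; the helper name `describe_exit` is hypothetical. By convention, shells and `docker run` report a process killed by signal N as exit status 128 + N, while Python's `subprocess` reports it as -N. A status of 137 (128 + 9) means SIGKILL, which is what the Linux out-of-memory killer sends; whether that is what happened here is only a guess.

```python
import signal


def describe_exit(returncode):
    """Explain an exit status from subprocess or from a shell / `docker run`."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # subprocess convention: -N means "killed by signal N".
        signum = -returncode
    elif returncode > 128:
        # Shell / docker convention: 128 + N means "killed by signal N".
        signum = returncode - 128
    else:
        return f"exited with error code {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"
    return f"killed by {name}"


print(describe_exit(137))  # killed by SIGKILL
print(describe_exit(1))    # exited with error code 1
```

In a shell, the status of the last command is available as `$?` right after it returns.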

LiyangTseng commented 2 years ago

Hi @rodrigo-castellon, thank you for the response. I found that the other representation extraction methods (musicnn, choi, etc.) all work, which makes me wonder whether my hardware simply does not meet the requirements for running Jukebox inference, as stated in the Jukebox repo:

The hps are for a V100 GPU with 16 GB GPU memory. The 1b_lyrics, 5b, and 5b_lyrics top-level priors take up 3.8 GB, 10.3 GB, and 11.5 GB, respectively

Also, since I'm fairly new to Docker, I was wondering which file you meant when you suggested adding print statements:

adding print statements after that one to check where exactly it fails

Does running Docker with the following command implicitly run representations/jukebox/main.py, so that I could insert print statements in that file to check what went wrong? Or does the Docker image act as a black box that cannot be modified?

docker run \
-it \
--rm \
-v /home/cdonahue/.jukemir/processed/gtzan_ff/wav:/input \
-v /home/cdonahue/.jukemir/representations/gtzan_ff/jukebox:/output \
jukemir/representations_jukebox \
--batch_size 256 \
--batch_idx 1

rodrigo-castellon commented 2 years ago

Sorry, yes, it is not exactly straightforward to modify the code, but you might be able to create another image that builds on this one and modifies the representations/jukebox/main.py file (or lets you modify it by running in interactive mode). I'm currently busy, but I can get back to you later with the exact steps.

LiyangTseng commented 2 years ago

OK, that would be very helpful! Is this related to the usage of representations/jukebox.dockerfile?

LiyangTseng commented 2 years ago

I actually have another issue when extracting features using musicnn: the following errors pop up. I did find solutions for similar errors, but they also seem to require modifications inside the Docker image. Could you provide some further instructions for solving this issue as well?

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])

  0% 0/235 [00:00<?, ?it/s]2022-06-04 07:33:42.725555: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2022-06-04 07:33:42.755915: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.756358: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: NVIDIA GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2022-06-04 07:33:42.756613: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2022-06-04 07:33:42.757485: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2022-06-04 07:33:42.758211: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2022-06-04 07:33:42.758425: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2022-06-04 07:33:42.759361: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2022-06-04 07:33:42.760092: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2022-06-04 07:33:42.762102: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2022-06-04 07:33:42.762239: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.762717: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.763052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2022-06-04 07:33:42.763318: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-06-04 07:33:42.838388: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.838765: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x50e7f00 executing computations on platform CUDA. Devices:
2022-06-04 07:33:42.838781: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): NVIDIA GeForce RTX 2070, Compute Capability 7.5
2022-06-04 07:33:42.840274: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3699850000 Hz
2022-06-04 07:33:42.840548: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5cd7a60 executing computations on platform Host. Devices:
2022-06-04 07:33:42.840561: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2022-06-04 07:33:42.840700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.840989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: NVIDIA GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2022-06-04 07:33:42.841019: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2022-06-04 07:33:42.841031: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2022-06-04 07:33:42.841042: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2022-06-04 07:33:42.841069: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2022-06-04 07:33:42.841097: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2022-06-04 07:33:42.841108: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2022-06-04 07:33:42.841120: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2022-06-04 07:33:42.841193: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.841649: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.841955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2022-06-04 07:33:42.841998: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2022-06-04 07:33:42.842689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-06-04 07:33:42.842699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2022-06-04 07:33:42.842704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2022-06-04 07:33:42.842772: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.843111: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-04 07:33:42.843377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7405 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
2022-06-04 07:33:43.866519: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2022-06-04 07:33:43.980245: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2022-06-04 07:33:44.391400: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2022-06-04 07:33:44.391463: W ./tensorflow/stream_executor/stream.h:1995] attempting to perform DNN operation using StreamExecutor without DNN support

  0% 0/235 [00:02<?, ?it/s]
Computing spectrogram (w/ librosa) and tags (w/ tensorflow).. Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: cuDNN launch failure : input shape ([1,1,187,96])
     [[{{node model/batch_normalization/cond/FusedBatchNorm_1}}]]
  (1) Internal: cuDNN launch failure : input shape ([1,1,187,96])
     [[{{node model/batch_normalization/cond/FusedBatchNorm_1}}]]
     [[model/Add_1/_137]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 42, in <module>
    input_path, model="MSD_musicnn_big", extract_features=True
  File "/code/musicnn-516acb2a0ff5ef73f64547898e018e793152c506/musicnn/extractor.py", line 172, in extractor
    is_training: False})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: cuDNN launch failure : input shape ([1,1,187,96])
     [[node model/batch_normalization/cond/FusedBatchNorm_1 (defined at /tmp/tmp7b53zes7.py:14) ]]
  (1) Internal: cuDNN launch failure : input shape ([1,1,187,96])
     [[node model/batch_normalization/cond/FusedBatchNorm_1 (defined at /tmp/tmp7b53zes7.py:14) ]]
     [[model/Add_1/_137]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'model/batch_normalization/cond/FusedBatchNorm_1':
  File "main.py", line 42, in <module>
    input_path, model="MSD_musicnn_big", extract_features=True
  File "/code/musicnn-516acb2a0ff5ef73f64547898e018e793152c506/musicnn/extractor.py", line 141, in extractor
    y, timbral, temporal, cnn1, cnn2, cnn3, mean_pool, max_pool, penultimate = models.define_model(x, is_training, model, num_classes)
  File "/code/musicnn-516acb2a0ff5ef73f64547898e018e793152c506/musicnn/models.py", line 20, in define_model
    return build_musicnn(x, is_training, num_classes, num_filt_midend=512, num_units_backend=500)
  File "/code/musicnn-516acb2a0ff5ef73f64547898e018e793152c506/musicnn/models.py", line 32, in build_musicnn
    frontend_features_list = frontend(x, is_training, config.N_MELS, num_filt=1.6, type='7774timbraltemporal')
  File "/code/musicnn-516acb2a0ff5ef73f64547898e018e793152c506/musicnn/models.py", line 58, in frontend
    normalized_input = tf.compat.v1.layers.batch_normalization(expand_input, training=is_training)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/normalization.py", line 327, in batch_normalization
    return layer.apply(inputs, training=training)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1479, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py", line 537, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 634, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 146, in wrapper
    ), args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 450, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/tmp7b53zes7.py", line 14, in tf__call
    retval_ = ag__.converted_call('call', super(BatchNormalization, self), ag__.ConversionOptions(recursive=True, force_conversion=False, optional_features=(), internal_convert_user_code=True), (inputs,), {'training': training})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 356, in converted_call
    return _call_unconverted(f, args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 253, in _call_unconverted
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/normalization.py", line 651, in call
    outputs = self._fused_batch_norm(inputs, training=training)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/normalization.py", line 494, in _fused_batch_norm
    training, _fused_batch_norm_training, _fused_batch_norm_inference)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/tf_utils.py", line 58, in smart_cond
    pred, true_fn=true_fn, false_fn=false_fn, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/smart_cond.py", line 59, in smart_cond
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1988, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1814, in BuildCondBranch
    original_result = fn()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/normalization.py", line 491, in _fused_batch_norm_inference
    data_format=self._data_format)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py", line 1329, in fused_batch_norm
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 3946, in _fused_batch_norm
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()
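
A commonly reported workaround for `Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR` is to let TensorFlow allocate GPU memory on demand instead of reserving nearly all of it at startup. The following is a minimal sketch, assuming the TensorFlow build inside the image honors the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable; the helper name is hypothetical, and nothing in this thread confirms that it resolves this particular failure.

```python
import os


def enable_gpu_memory_growth(env=None):
    """Ask TensorFlow to grow its GPU memory allocation on demand."""
    env = os.environ if env is None else env
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env


# This must run before `import tensorflow`, because the variable is read
# when TensorFlow initializes the GPU.
enable_gpu_memory_growth()
```

The same variable could instead be passed to the container with `docker run -e TF_FORCE_GPU_ALLOW_GROWTH=true ...`, which avoids editing main.py.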

rodrigo-castellon commented 2 years ago

Hi, apologies for the late reply; I hope you got this working. For completeness, and in case you are still unsure, here's how you can modify a particular Docker image. Say you want to modify representations_musicnn. I would do the following:

  1. Look at the Dockerfile for it, which is located at this link. The ENTRYPOINT is the command python main.py, which means that this is the script that ultimately gets executed. The rest of the Dockerfile just sets up the environment to run that script.
  2. Create a Dockerfile with the contents

         FROM jukemir/representations_musicnn:latest

         ENTRYPOINT ["bash"]

     and then run `docker build -t musicnn_modified .` (or `docker build -t musicnn_modified -f dockerfilename .` if you chose a different name for your Dockerfile).
  3. Run the built image and use the shell inside it, so that you can poke around and see what the container sees for yourself, which helps when debugging issues like the ones you've mentioned above. This can be done with (as an example) `docker run --rm -it -v /home/unixusernamehere/.jukemir/processed/gtzan_ff/wav:/input -v /home/unixusernamehere/.jukemir/representations/gtzan_ff/musicnn:/output musicnn_modified`. Note that anything placed after the image name is passed to the entrypoint, which is now bash, so the script's arguments go on the next command rather than on `docker run`. You'll be given a bash prompt, and running `python main.py --batch_size 256 --batch_idx 0` there should reproduce the error you've been getting. Since you have a terminal shell, you can look around and figure out what's causing the issue. This is a particularly nice environment because, if you get confused about any changes you've made and want to start over, you can just re-run the Docker image. Once you figure it out, see the next step.
  4. To fix or work around the issue, you're probably going to need to run one or more commands (for example, `sed` to patch a file, a command to upgrade a dependency, or something like that). Once you've nailed down the chain of commands so that `python main.py` works flawlessly, put them into the Dockerfile like so:

         FROM jukemir/representations_musicnn:latest

         RUN first command
         RUN second command
         RUN third command
         ...

         ENTRYPOINT ["python", "main.py"]

     Don't forget to change the last ENTRYPOINT line back, as well.

At this point, you should be good to go.
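
The final Dockerfile from the last step can also be generated from a list of fix commands, which keeps the commands in one place while you iterate. The following is a minimal sketch; `render_dockerfile` is a hypothetical helper, and the `echo` commands are placeholders for whatever fixes you settle on.

```python
import json


def render_dockerfile(base_image, fix_commands, entrypoint):
    """Render a Dockerfile that applies fix commands on top of a base image."""
    lines = [f"FROM {base_image}", ""]
    # One RUN instruction per fix command, in the order given.
    lines += [f"RUN {command}" for command in fix_commands]
    if fix_commands:
        lines.append("")
    # The exec form of ENTRYPOINT is a JSON array of strings.
    lines.append(f"ENTRYPOINT {json.dumps(list(entrypoint))}")
    return "\n".join(lines) + "\n"


dockerfile = render_dockerfile(
    "jukemir/representations_musicnn:latest",
    ["echo first fix", "echo second fix"],  # placeholder commands
    ["python", "main.py"],
)
print(dockerfile)
```

Write the returned string to a file named Dockerfile and build it with `docker build -t musicnn_modified .` as in step 2.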

Let me know if you have any more issues.

LiyangTseng commented 2 years ago

Thanks for the detailed reply! Also, just curious: with the pre-trained weights, is it possible to extract Jukebox representations using only a CPU?

rodrigo-castellon commented 2 years ago

It is probably possible, but it would need a decent amount of engineering work, since the Jukebox codebase itself is written on the assumption that it's always running on a GPU (you can see for yourself if you try running the Interacting with Jukebox notebook without a GPU).