Closed (thuster closed this issue 11 months ago)
Hi @thuster,
Thank you for your attention. I should first share that I have updated M2D to add M2D for Speech and also updated EVAR.
My fixes should include the np.float issue, but I will confirm it later.
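For context, the np.float issue usually arises because NumPy deprecated the `np.float` alias in version 1.20 and removed it in 1.24, so code that still refers to it raises an AttributeError on recent NumPy. A minimal sketch of the usual replacement follows; the function name `to_float_array` is hypothetical and is not taken from M2D or EVAR.

```python
import numpy as np


def to_float_array(values):
    """Convert a sequence to a float64 array without the removed np.float alias."""
    # Old style: np.asarray(values, dtype=np.float)  -> fails on NumPy >= 1.24
    # np.float was only an alias for the builtin float, so dtype=float or
    # dtype=np.float64 gives the same result on every NumPy version.
    return np.asarray(values, dtype=np.float64)


if __name__ == "__main__":
    print(to_float_array([1, 2, 3]).dtype)
```

Pinning an older NumPy would also avoid the error, but replacing the alias keeps the code working across versions.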
Here are my environment details.
And the following are my test scripts. I ran them successfully yesterday.
conda create -n ar python==3.8
conda activate ar
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
pip install -r requirements.txt
You can remove the environment with conda remove -n ar --all anytime later.
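To confirm that the environment matches the versions pinned in the conda install command above, a small check like the following may help. It is only a sketch: the helper name `check_versions` is hypothetical, and it assumes the packages expose metadata under the distribution names `torch`, `torchvision`, and `torchaudio`.

```python
from importlib import metadata

# Pins taken from the conda install command above.
EXPECTED = {"torch": "1.12.1", "torchvision": "0.13.1", "torchaudio": "0.12.1"}


def check_versions(expected):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Installed versions may carry a local build tag such as "+cu116",
        # so compare only the part before the "+".
        ok = have is not None and have.split("+")[0] == want
        report[name] = (want, have, ok)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {have} -> {status}")
```

Run it inside the activated `ar` environment; a package that is not installed is reported with `found None`.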
git clone https://github.com/nttcslab/m2d.git
cd m2d
curl -o util/lars.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/util/lars.py
curl -o util/lr_decay.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/util/lr_decay.py
curl -o util/lr_sched.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/util/lr_sched.py
curl -o util/misc.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/util/misc.py
curl -o m2d/pos_embed.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/util/pos_embed.py
curl -o train_audio.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/main_pretrain.py
curl -o train_speech.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/main_pretrain.py
curl -o mae_train_audio.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/main_pretrain.py
curl -o m2d/engine_pretrain_m2d.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/engine_pretrain.py
curl -o m2d/models_mae.py https://raw.githubusercontent.com/facebookresearch/mae/efb2a8062c206524e35e47d04501ed4f544c0ae8/models_mae.py
curl -o m2d/timm_layers_pos_embed.py https://raw.githubusercontent.com/huggingface/pytorch-image-models/e9373b1b925b2546706d78d25294de596bad4bfe/timm/layers/pos_embed.py
patch -p1 < patch_m2d.diff
(in the m2d folder)
git clone https://github.com/nttcslab/eval-audio-repr.git evar
cd evar
curl https://raw.githubusercontent.com/daisukelab/general-learning/master/MLP/torch_mlp_clf2.py -o evar/utils/torch_mlp_clf2.py
curl https://raw.githubusercontent.com/daisukelab/sound-clf-pytorch/master/for_evar/sampler.py -o evar/sampler.py
curl https://raw.githubusercontent.com/daisukelab/sound-clf-pytorch/master/for_evar/cnn14_decoupled.py -o evar/cnn14_decoupled.py
ln -s /my_lab/common/evar/work .  # Please create your preprocessed data under work instead
cp /my_lab/common/evar/evar/metadata/* evar/metadata  # Please create your own instead
cd ..
I hope it works for you too. If you still face any issues, please feel free to ask again.
Thanks for the response! Those updates helped. I think my segfault was a hardware issue on my side. It looks like both the linear and finetuning are working now. The only remaining environment error I got was when I tried to create the metadata. The issue was with this line:
https://github.com/nttcslab/eval-audio-repr/blob/main/evar/utils/make_metadata.py#L20
I replaced this line with import urllib.request and everything worked.
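For anyone hitting the same error: in Python, a plain `import urllib` does not guarantee that the `urllib.request` submodule is loaded, so calls through `urllib.request` can fail with an AttributeError unless some other import has already loaded it. The sketch below illustrates the explicit import; `fetch_bytes` is a hypothetical helper, not code from make_metadata.py, and it uses a local file:// URL so it runs without network access.

```python
import tempfile
import urllib.request  # explicit submodule import; plain `import urllib` may not load it
from pathlib import Path


def fetch_bytes(url):
    """Read the full contents of a URL (http, https, or file scheme)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


if __name__ == "__main__":
    # Write a small local file and read it back through urllib.request.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_bytes(b"hello")
        print(fetch_bytes(sample.as_uri()))
```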
Thanks for the help - looking forward to trying it out.
Hi @thuster,
Thanks for the quick confirmation and for pointing out the urllib issue. I have fixed it, and you should now be able to create the GTZAN and VC1 metadata files. I'm sorry for not being aware of the problem and for taking up your time.
I hope our resources help with your work. Thanks again!
I'm trying to replicate the experiments in this paper. Looks like a cool framework, but installation and testing with m2d+evar is pretty challenging. After pip installing requirements, there appear to be version problems (e.g., np.float, urllib.request). After debugging a few obvious things, I'm now getting a segfault.
Can you provide a conda environment or pip reqs with versions? It would also be nice if the installation was a bit more streamlined, although the instructions are pretty accurate if carefully followed.
Thanks!