facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

data2vec inference #4152

Closed Alexadar closed 2 years ago

Alexadar commented 2 years ago

Hi!
I want to try inference with data2vec. There is a lack of documentation on how to do this with the pretrained models.

For example, how do I get the model's output for a masked position in an input, using a pretrained model, for text like: "Archaeoindris fontoynontii is an extinct giant lemur and the largest primate known to have evolved on Madagascar, comparable in size to a male gorilla. It belonged to a family of extinct lemurs known as "sloth lemurs" (Palaeopropithecidae)"

Any help is appreciated.

alexeib commented 2 years ago

Sorry, I don't get the question: are you talking about the audio or the NLP model here? Based on your example I assume NLP? If so, you can follow the examples in the RoBERTa README; they largely still apply, except there is no "lm head" since we don't predict input units.

Alexadar commented 2 years ago

> sorry i dont get the question, are you talking about audio or nlp model here? based on your example i assume nlp? if so you can follow examples in roberta readme, it largely still applies except there is no "lm head" since we dont predict input units

Yes, about the NLP model. Thanks, I will take a look. Is it straightforward, or are there specific details to be aware of?

Jiltseb commented 2 years ago

@alexeib On a similar note, how exactly do I run data2vec inference on custom audio files? As per the docs, I can evaluate the CTC model by specifying the data manifests and other parameters. What should the field `dataset.gen_subset=dev_clean,dev_other,test_clean,test_other` be, since I am not using LibriSpeech for evaluation? It is a mandatory field and leads to errors if unspecified:

```
python examples/speech_recognition/new/infer.py --config-dir examples/speech_recognition/new/conf \
  --config-name infer task=audio_finetuning task.data=manifest_file.tsv \
  common.user_dir=examples/data2vec task.labels=ltr decoding.type=viterbi \
  common_eval.path=<data2vec_speech_model>.pt decoding.beam=1500 \
  distributed_training.distributed_world_size=2
```

Output error: `AssertionError: Unexpected type for root: NoneType` (since the dataset is not specified, I guess)

Any help is really appreciated!

alexeib commented 2 years ago

gen_subset is the name of the manifest file for the subset you wish to evaluate. e.g. if you have valid.tsv as the subset you are evaluating, you would specify gen_subset=valid
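As a rough illustration of the lookup described above (the helper below is hypothetical, not fairseq code): the audio tasks resolve the subset name to a manifest file named `<gen_subset>.tsv` inside the data directory, so the value must match the manifest filename without its `.tsv` extension.

```python
import os

def manifest_path(data_dir: str, gen_subset: str) -> str:
    """Hypothetical helper mirroring how fairseq audio tasks locate a
    manifest: "<task.data>/<dataset.gen_subset>.tsv"."""
    return os.path.join(data_dir, f"{gen_subset}.tsv")

# dataset.gen_subset=valid -> the task expects valid.tsv in the data directory
print(manifest_path("/data/manifests", "valid"))  # → /data/manifests/valid.tsv
```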

Jiltseb commented 2 years ago

Thanks. What is the expected format? I have train and validation split .tsv files created from the wav files with examples/wav2vec/wav2vec_manifest.py, and when I run the following, I still get the same error:

```
CUDA_VISIBLE_DEVICES=2,4 python examples/speech_recognition/new/infer.py \
  --config-dir examples/speech_recognition/new/conf --config-name infer \
  task=audio_finetuning task.data=path_to_train.tsv common.user_dir=examples/data2vec \
  task.labels=ltr decoding.type=viterbi decoding.unique_wer_file=True \
  dataset.gen_subset=valid common_eval.path=/path_to_audio_base_ls_960h.pt \
  decoding.beam=1500 distributed_training.distributed_world_size=2
```

For instance, in the train.tsv file I only have 10 audio files, and 3 in valid.tsv. Am I missing something here? I am putting the audio files in a folder and the .tsv files outside of it.
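For context, here is a minimal sketch of the manifest layout that examples/wav2vec/wav2vec_manifest.py produces, as far as I can tell (all paths, filenames, and frame counts below are illustrative, not from this thread): the first line of `<subset>.tsv` is the audio root directory, each following line is a tab-separated relative path and sample count, and with `task.labels=ltr` a parallel `<subset>.ltr` file holds space-separated letters with `|` as the word boundary.

```python
import os
import tempfile

# Illustrative names only; real manifests come from wav2vec_manifest.py.
root = tempfile.mkdtemp()          # directory that will hold the manifests
audio_root = "/data/audio"         # directory that holds the actual wav files
files = [("utt1.wav", 16000), ("utt2.wav", 32000)]  # (relative path, num samples)

# <subset>.tsv: first line is the audio root, then "<relpath>\t<num_samples>" per file
with open(os.path.join(root, "valid.tsv"), "w") as f:
    f.write(audio_root + "\n")
    for rel, n in files:
        f.write(f"{rel}\t{n}\n")

# <subset>.ltr: one transcript per line, letters space-separated, "|" = word boundary
with open(os.path.join(root, "valid.ltr"), "w") as f:
    f.write("H E L L O | W O R L D |\n")
    f.write("H I |\n")

with open(os.path.join(root, "valid.tsv")) as f:
    lines = f.read().splitlines()
print(lines[0])  # first line is the audio root directory
```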

alexeib commented 2 years ago

What's the stack trace? And I assume you correctly set the path here: task.data=path_to_train.tsv, right?

Jiltseb commented 2 years ago

Hi @alexeib,

Thanks for your answer. Here is the stack trace:

```
Traceback (most recent call last):
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
    lambda: hydra.run(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 88, in run
    cfg = self.compose_config(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 559, in compose_config
    cfg = self.config_loader.load_configuration(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 141, in load_configuration
    return self._load_configuration_impl(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 262, in _load_configuration_impl
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 378, in _apply_overrides_to_config
    OmegaConf.update(cfg, key, value, merge=True)
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 730, in update
    assert isinstance(
AssertionError: Unexpected type for root: NoneType
```

task.data is set to the path to train.tsv, and dataset.gen_subset is set to 'valid'; valid.tsv is in the same directory as train.tsv. The error still points at the dataset data class field.

alexeib commented 2 years ago

Hmm, this stack trace seems pretty generic. Can you share the entire output of what happens when you run the command? And just to make sure: you are using one of the fine-tuned models, not the "no finetuning" model, right?

Jiltseb commented 2 years ago

@alexeib Right. I am using the data2vec model fine-tuned on 960h of LibriSpeech (audio_base_ls_960h.pt). Here is the complete output:

```
(base) jilt@ada:~/workspace/self-supervised-speech-recognition/fairseq$ CUDA_VISIBLE_DEVICES=4 python examples/speech_recognition/new/infer.py --config-dir examples/speech_recognition/new/conf --config-name infer task=audio_finetuning task.data=/data/jilt/datasets/mydata/train.tsv common.user_dir=examples/data2vec task.labels=ltr decoding.type=viterbi decoding.unique_wer_file=True dataset.gen_subset=valid common_eval.path=/data/jilt/ASR/data2vec/audio_base_ls_960h.pt decoding.beam=1500 distributed_training.distributed_world_size=2
/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/core/default_element.py:122: UserWarning: In 'infer': Usage of deprecated keyword in package header '# @package _group_'. See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/changes_to_package_header for more information
  deprecation_warning(
/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'infer': Defaults list is missing `_self_`. See https://hydra.cc/docs/upgrades/1.0_to_1.1/default_composition_order for more information
  warnings.warn(msg, UserWarning)
examples/speech_recognition/new/infer.py:478: UserWarning: 'infer' is validated against ConfigStore schema with the same name. This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2. See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  hydra_main()  # pylint: disable=no-value-for-parameter
Traceback (most recent call last):
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
    lambda: hydra.run(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 88, in run
    cfg = self.compose_config(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 559, in compose_config
    cfg = self.config_loader.load_configuration(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 141, in load_configuration
    return self._load_configuration_impl(
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 262, in _load_configuration_impl
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 378, in _apply_overrides_to_config
    OmegaConf.update(cfg, key, value, merge=True)
  File "/home/jilt/anaconda3/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 730, in update
    assert isinstance(
AssertionError: Unexpected type for root: NoneType
2022-02-01 12:48:59 | INFO | wandb.sdk.internal.internal | Internal process exited
```

Thanks for the help in advance!

Jiltseb commented 2 years ago

@alexeib Could you please let me know if there is any update on this?

alexeib commented 2 years ago

Hey, I am not able to reproduce this error. Could you maybe roll back to an earlier version of omegaconf and hydra (see requirements.txt in the fairseq root for versions) and try again?
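A rollback along these lines might work; treat the constraints below as illustrative and take the authoritative pins from fairseq's own requirements/setup files rather than from this snippet:

```shell
# Illustrative version constraints; verify against fairseq's requirements.txt
pip install "omegaconf<2.1" "hydra-core<1.1"
```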

ddoron9 commented 2 years ago

I guess this is a hydra and omegaconf version difference. I got the same error while updating configs using the newest omegaconf and hydra; downgrading those two packages worked for me.

Jiltseb commented 2 years ago

Hi @alexeib, it looks like it is indeed a problem with the hydra and omegaconf version difference. I still need to figure out how the dataset needs to be structured (even for just inference), but the error reported in this issue is solved. Thank you!