AlibabaResearch / DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
MIT License

spectra processed dataset #61

Closed by loutasr 5 months ago

loutasr commented 12 months ago

I am highly interested in the spectra project and would like to conduct experiments and testing using some preprocessed data. Currently, I couldn’t find any preprocessed data readily available for use within the project. Therefore, I kindly request your assistance in providing some preprocessed data that I can use to better understand and utilize the spectra project. Thanks.

publicstaticvo commented 12 months ago

Our pre-training dataset is Spotify 100K. You can follow the instructions in this link to apply for access. We will soon update the README.md file with how to preprocess the pre-training data.

loutasr commented 12 months ago

Alright, thank you! I'll give it a try first and look forward to your updates.

huigeStudent commented 8 months ago

(Sob) Still no update?

tnlin commented 8 months ago

cc @publicstaticvo Hi, sorry for the delay. Here are the processed features of downstream tasks. The SpokenWoz dataset will be released later.

https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/iemocap.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mintrec.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mosei.tgz
https://space-mm-data.oss-cn-wulanchabu.aliyuncs.com/downstreamv2/mosi.tgz

To access the training, validation, and test files in the datasets, you can use the following command to extract the mosi.tgz file:

tar -xzvf mosi.tgz
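
If you would rather extract from Python than from the shell, a minimal sketch using the standard-library tarfile module might look like the following. The helper name extract_archive and the destination directory are hypothetical, and the assumption that the split files end in .pkl comes only from the description in this thread.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive and return the .pkl files found in it.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Prefer the safer "data" extraction filter where this Python version has it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    # Assumption: the split files use the .pkl extension, as described in this thread.
    return sorted(dest.rglob("*.pkl"))
```

For example, extract_archive("mosi.tgz", "data/mosi") would unpack the archive into data/mosi (a made-up path) and return whichever .pkl files it finds there.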

Once extracted, you'll find .pkl files for training, validation, and testing. Each pickle file contains a list of samples, and each sample includes the following components:

  1. Audio Features: This field contains the audio feature data.
  2. Text Token IDs: Here, you'll find the IDs corresponding to text tokens.
  3. Label: This is the label assigned to the sample.
  4. History Audio Features (if applicable): If present, this field contains historical audio feature data.
  5. History Text Token IDs (if applicable): Similar to the above, this includes historical text token IDs, if available.
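
To check what a split file actually contains before writing a data loader, a small inspection sketch like the one below may help. The thread does not say whether each sample is a tuple, a list, or a dict, or what the files are called, so the code only reports the type and length of each field rather than assuming a layout. The function names are hypothetical. Note that unpickling can execute arbitrary code, so only load files from a source you trust.

```python
import pickle


def load_samples(pkl_path):
    """Load one split file; per the description above it should hold a list of samples."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def summarize(samples, limit=1):
    """Return (type name, length or None) for each field of the first `limit` samples."""
    summaries = []
    for sample in samples[:limit]:
        # Assumption: a sample is either a dict or a sequence of fields;
        # the thread does not specify which, so both are handled.
        fields = sample.values() if isinstance(sample, dict) else sample
        summaries.append(
            [(type(f).__name__, len(f) if hasattr(f, "__len__") else None) for f in fields]
        )
    return summaries
```

Calling summarize(load_samples(path)) on one of the extracted files should show how many fields a sample has and roughly how long each one is, which can then be matched against the five components listed above.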

We hope this information helps you in utilizing the dataset effectively. Should you have any questions or need further assistance, please feel free to reach out.