Open clarahohohoho opened 3 years ago
Hi, I found out that in the write_records_tcd.py step, the variable append_aus of the write_bmp_records() function in the dataset_write.py script defaults to False. I tried changing it to append_aus=True, but another issue surfaced: ValueError: 'AU25_r' is not in list. This is probably because the headers in my csv files are limited to ['frame,face_id,timestamp,confidence,success,pose_Tx,pose_Ty,pose_Tz,pose_Rx,pose_Ry,pose_Rz']. Any idea how I can proceed? Thank you!
Hi @clarahohohoho
aus stands for facial action units (AUs). We proposed regressing AUs from the video representations jointly with the speech decoding task, in order to overcome a learning issue of AV Align (where the audio encoder attends to the video encoder) seen on a task more challenging than speaker-dependent TCD-TIMIT.
If you are running an experiment using the run_audiovisual.py script, please note the following parameter: regress_aus=True. When this flag is enabled, the tfrecord file is expected to contain a sequence of action unit intensities, allowing the computation of the distance between these ground truth values and the network's prediction. You may set this flag to False, depending on the goals of your research.
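To illustrate what this distance could look like, here is a minimal sketch in plain Python. The clipping to [0, 3] and the division by 3 mirror the encoder.py line quoted later in this thread. The function names and the choice of mean squared error are assumptions made for illustration; the repository may compute the distance differently.

```python
def normalise_aus(frames, max_intensity=3.0):
    """Clip each AU intensity to [0, max_intensity] and rescale it to [0, 1]."""
    return [
        [min(max(value, 0.0), max_intensity) / max_intensity for value in frame]
        for frame in frames
    ]


def au_regression_loss(predicted, target_aus):
    """Mean squared error between predictions and normalised AU targets.

    MSE is an assumption for this sketch, not necessarily the repository's choice.
    """
    target = normalise_aus(target_aus)
    squared_errors = [
        (p - t) ** 2
        for predicted_frame, target_frame in zip(predicted, target)
        for p, t in zip(predicted_frame, target_frame)
    ]
    return sum(squared_errors) / len(squared_errors)


# Two frames with two action units each (made-up values).
targets = [[0.0, 1.5], [3.0, 4.5]]      # raw intensities, 4.5 gets clipped to 3.0
predictions = [[0.0, 0.5], [0.5, 1.0]]  # network outputs, already in [0, 1]
loss = au_regression_loss(predictions, targets)
```

With regress_aus=False this term would simply not be computed, so the tfrecords would not need to carry the AU sequence.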
To generate target values for the AU intensities, we used the OpenFace toolkit. The extract_faces.py script is a wrapper that calls the OpenFace binaries from Python and generates the bmp and csv files in the format expected by the code in this repository. The Action Units are written to the csv by appending the -aus flag; please see here for the complete set of CLI arguments. I realise now that the -aus flag is not used in the example pre-processing script, while regress_aus is set to True in the AV experiment launch script, so I'll correct this issue.
You may need to pre-process the video files again, setting the -aus flag in extract_faces.py, and then re-generate the tfrecords. For convenience, I stored a single set of tfrecord files with this metadata appended, and only enabled or disabled AUs at runtime.
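Before re-generating the tfrecords, it may be worth checking that the new csv files actually contain AU intensity columns such as AU25_r. Below is a minimal sketch using only the standard library. The helper name au_intensity_columns and the in-memory sample headers are hypothetical, and the column pattern assumes the OpenFace naming seen in the error message above (AU, two digits, then _r for intensities).

```python
import csv
import io
import re

# Assumed naming of AU intensity columns, e.g. "AU25_r".
AU_INTENSITY_PATTERN = re.compile(r"^AU\d{2}_r$")


def au_intensity_columns(csv_file):
    """Return the AU intensity column names found in the header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> no columns
    # Strip whitespace in case the header has spaces after the commas.
    names = [name.strip() for name in header]
    return [name for name in names if AU_INTENSITY_PATTERN.match(name)]


# Made-up headers standing in for real csv files on disk.
without_aus = io.StringIO("frame,face_id,timestamp,confidence,success,pose_Tx\n")
with_aus = io.StringIO("frame, face_id, AU01_r, AU25_r, AU25_c\n")

missing = au_intensity_columns(without_aus)
present = au_intensity_columns(with_aus)
```

For a real file, the same helper can be called on open(path, newline=""). An empty result suggests the -aus flag did not take effect during extraction.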
I hope this helps, please let me know if there is something else to clarify.
Hi, I get the error KeyError: 'aus' from the line normed_aus = tf.clip_by_value(self._data.payload['aus'], 0.0, 3.0) / 3.0 in encoder.py. I preprocessed the LRS3 data with extract_faces.py and write_records_tcd.py. I realized that my self._data.payload is an empty dictionary. Any idea how to solve this error? Or is there any other variable that I can replace self._data.payload['aus'] with? Any help is appreciated, thank you in advance!