Open jblom opened 2 years ago
Update
The ASR worker now stores the following information in each DANE Result:
asr_processing_time: float # retrieved via submit_asr_job()
download_time: float # retrieved via dane-beng-download-worker or download_content()
kaldi_nl_version: str = "Kaldi-NL v0.4.1" # default for now
kaldi_nl_git_url: str = (
"https://github.com/opensource-spraakherkenning-nl/Kaldi_NL" # default for now
)
The code has already been merged, but it has not yet been tested in a real workflow.
To prepare for a full provenance chain, it's good to start adding easy-to-obtain provenance and timing information to the DANE Results of the ASR worker and the download worker.
Besides feeding the desired provenance model (e.g. for informing researchers), storing this information is also very useful for more precise debugging of the DANE ASR workflow.
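For illustration, here is a minimal sketch of how the timing and version fields could be gathered into one record before being attached to a DANE Result. Only the four field names and the two default values come from this issue. The `ASRProvenance` class, the `timed()` helper and the two `fake_*` functions are hypothetical stand-ins (the real worker uses `download_content()` and `submit_asr_job()`), and measuring in seconds is an assumption.

```python
import time
from dataclasses import asdict, dataclass
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class ASRProvenance:  # hypothetical container name, not from the worker code
    asr_processing_time: float  # assumed to be in seconds
    download_time: float  # assumed to be in seconds
    kaldi_nl_version: str = "Kaldi-NL v0.4.1"  # default for now
    kaldi_nl_git_url: str = (
        "https://github.com/opensource-spraakherkenning-nl/Kaldi_NL"  # default for now
    )


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn and return its result together with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def fake_download() -> str:
    """Hypothetical stand-in for the download step."""
    time.sleep(0.01)
    return "/tmp/input.mp4"


def fake_asr(input_path: str) -> str:
    """Hypothetical stand-in for the ASR step."""
    time.sleep(0.01)
    return f"transcript of {input_path}"


# Time each step separately so slow downloads and slow ASR runs can be told apart.
input_path, download_time = timed(fake_download)
transcript, asr_processing_time = timed(lambda: fake_asr(input_path))

provenance = ASRProvenance(
    asr_processing_time=asr_processing_time,
    download_time=download_time,
)

# A plain dict is easy to serialise into a result payload.
payload = asdict(provenance)
print(payload)
```

Keeping the two timings as separate fields, rather than one total, is what makes the record useful for debugging: it shows directly whether a slow job was held up by the download or by the ASR run.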