Open hagenw opened 2 months ago
I would go for this one:
> Should we pre-define a list of file extension(s), that are then treated as text files?
It would also be great if we could support structured text, as in a `json` file. This is especially useful for dialog datasets with metadata on the turn level.
> And how to return the content of a text file: Should it be a text string? Should it be a JSON string?
If it is a `.txt` file, it should be a text string; if it is a `.json` file, it should be a JSON string. IMHO this would be both the simplest and the clearest solution.
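A minimal sketch of the extension-based proposal above, under stated assumptions: `TEXT_EXTENSIONS` and `read_text()` are hypothetical names chosen for illustration and are not part of any existing API. A `.txt` file is returned as a plain string, a `.json` file as a JSON string.

```python
import json
from pathlib import Path

# Hypothetical pre-defined list of extensions treated as text files
TEXT_EXTENSIONS = {".txt", ".json"}


def read_text(path):
    """Return the content of a text file as a string.

    Hypothetical helper: ``.txt`` content is returned unchanged,
    ``.json`` content is parsed and re-serialized,
    so the result is guaranteed to be a valid JSON string.
    """
    path = Path(path)
    ext = path.suffix.lower()
    if ext not in TEXT_EXTENSIONS:
        raise ValueError(f"Unsupported text file extension: {ext!r}")
    content = path.read_text(encoding="utf-8")
    if ext == ".json":
        # Round-trip through the parser to fail early on invalid JSON
        return json.dumps(json.loads(content))
    return content
```

Whether the JSON string should be handed on as a string or parsed into a dictionary is left open here; the sketch follows the suggestion to return a string in both cases.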
Thanks for the feedback, that indeed sounds like a good solution.
In `audb` 1.7.0 we added support to publish not only audio and video files, but every file format a user would like to publish. This means we should also adjust `process_index()`, `process_file()`, `process_files()`, `process_folder()` to support other files.

The question is how to best support text files:
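- Should we pre-define a list of file extension(s), that are then treated as text files?
- `try` and `except` statements (could be tricky as audio files might also fail for `audiofile` if `ffmpeg` is not installed)

And how to return the content of a text file:

- Should it be a text string?
- Should it be a JSON string?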
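To illustrate why the `try`/`except` option could be tricky, here is a rough sketch. `read_audio()` is a hypothetical stand-in for a real audio reader (for example a call such as `audiofile.read()`), and `read_any()` is likewise an assumed name, not an existing function.

```python
def read_audio(path):
    """Hypothetical stand-in for an audio reader.

    It always fails here, as a real reader would for a text file
    or for an audio file it cannot decode.
    """
    raise RuntimeError(f"cannot decode {path}")


def read_any(path, audio_reader=read_audio):
    """Try to read a file as audio first, fall back to text on failure."""
    try:
        return "audio", audio_reader(path)
    except Exception:
        # Ambiguity: this branch is also reached for a genuine audio file
        # when the decoder backend (e.g. ffmpeg) is missing,
        # so a broken setup is silently treated as "text file".
        with open(path, encoding="utf-8") as fp:
            return "text", fp.read()
```

The fallback cannot tell "not an audio file" apart from "audio file that failed to decode", which is the concern raised in the list above.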
/cc @maxschmitt