technosaby / RedHenAudioTagger

MIT License

Using Metadata with ELAN #28

Open technosaby opened 2 years ago

technosaby commented 2 years ago

The idea is to import this metadata into ELAN. Users could take a file, run the parser on it, import the file and the ssfx output into ELAN, and, voilà, have in ELAN a metadata field for the file that includes all the sound effects. ELAN would also let the user hand-tag sound effects, and the .eaf file could then be imported into the Red Hen metadata for the file.

Most people interested in using this thing work in ELAN. Study the Red Hen page on ELAN at https://sites.google.com/case.edu/techne-public-site/elan

technosaby commented 2 years ago

@brucearctor @turnermarkb Does this mean the model should emit a second format, .eaf (for ELAN), alongside the ssfx output, so the tags can be imported and viewed in the tool? Or do you want a script that converts the already generated ssfx output to EAF? Or is there a way to import the ssfx output into ELAN directly so that the tags are visible there?

brucearctor commented 2 years ago

I would not worry about ELAN, nor the ELAN format/type. Perhaps that would be the structured data to use -- but I don't think it needs to be the focus, since with any structured data we can arrive at ways to transform it.

EAF/XML is not known for its computational performance nor for efficient data size on disk. @technosaby , @turnermarkb -- any idea where the source code for ELAN lives [ I don't see it on GitHub, nor linked from https://archive.mpi.nl/tla/elan ] -- and ideally something like a 'Contributing' doc describing how the community operates and whether they accept contributions to the open source project? I'd be happy to explore the codebase and see whether I can extend the ELAN software to parse alternate files, and can otherwise make suggestions for how we handle transforms from one structured data type to another.

Ultimately, all of these outputs [ EAF/XML, SSFX, CSV, ... ] are 'just' structured datasets, and basic data engineering will allow transforming one into another -- it seems MUCH more important to think about which attributes we want in the structured output [ no matter the form ]. As I understand the intent and interest, it'd be wise to first settle on the structured data that is valuable to materialize to disk. What is easily accomplished using the off-the-shelf model? How does that change if @technosaby starts to retrain via his interest in Transfer Learning?

technosaby commented 2 years ago

@brucearctor It would be great if you could join the call at 2:00 PM EST so that we can sync up together.

turnermarkb commented 2 years ago

ELAN is not central. But you should be aware of it because it is the dominant app worldwide in research for tagging experiment clips. If you came up with a way to process a video through your pipeline and have the metadata import into ELAN, people would find it useful.


technosaby commented 2 years ago

@turnermarkb @brucearctor I am trying to write the EAF file by hand with the audio tags embedded, so that it can be imported into ELAN, but I am struggling. I came up with something like this (below), but the tool does not import it.

?xml version="1.0" encoding="UTF-8"?>

urn:nl-mpi-tools-elan-eaf:ba2bd17b-0d9d-4bcf-9208-b74a240d3632 83
... ... SPEECH RE

Could you write a sample EAF with one audio tag (like speech) that imports cleanly? Then I can study it and implement the SFX -> EAF conversion.

brucearctor commented 2 years ago

The first character should likely be an opening bracket, '<', and not a question mark, '?'. There could easily be other issues, though, and that one might just be a formatting artifact.

Somewhere @turnermarkb shared an example of a file he manually annotated -- that would probably work as a template: swap out the annotations he made for the 'sound effects' tags.
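
For reference, here is a minimal sketch of what a one-tag file could contain, built with Python's standard library. The element and attribute names follow the layout of EAF files that ELAN writes, to the best of my knowledge. The `default-lt` type ID, the `ts1`/`a1` IDs, the placeholder date, and the function name `build_minimal_eaf` are assumptions for illustration. The media descriptor and the schema-location attributes are left out, and the output has not been tested against ELAN's importer.

```python
import xml.etree.ElementTree as ET


def build_minimal_eaf(tag, start_ms, end_ms):
    """Build an EAF-like tree with one tier holding one time-aligned annotation.

    Assumption: attribute values such as the date and the 'default-lt'
    linguistic type ID are placeholders, not values required by ELAN.
    """
    root = ET.Element("ANNOTATION_DOCUMENT", {
        "AUTHOR": "",
        "DATE": "2022-07-11T00:00:00+00:00",  # placeholder timestamp
        "FORMAT": "3.0",
        "VERSION": "3.0",
    })
    ET.SubElement(root, "HEADER", {"MEDIA_FILE": "", "TIME_UNITS": "milliseconds"})

    # Time slots are declared once and referenced by ID from the annotation.
    time_order = ET.SubElement(root, "TIME_ORDER")
    ET.SubElement(time_order, "TIME_SLOT",
                  {"TIME_SLOT_ID": "ts1", "TIME_VALUE": str(start_ms)})
    ET.SubElement(time_order, "TIME_SLOT",
                  {"TIME_SLOT_ID": "ts2", "TIME_VALUE": str(end_ms)})

    # One tier named after the tag, containing a single annotation.
    tier = ET.SubElement(root, "TIER",
                         {"LINGUISTIC_TYPE_REF": "default-lt", "TIER_ID": tag})
    annotation = ET.SubElement(tier, "ANNOTATION")
    aligned = ET.SubElement(annotation, "ALIGNABLE_ANNOTATION", {
        "ANNOTATION_ID": "a1",
        "TIME_SLOT_REF1": "ts1",
        "TIME_SLOT_REF2": "ts2",
    })
    ET.SubElement(aligned, "ANNOTATION_VALUE").text = tag

    # The tier above refers to this linguistic type by ID.
    ET.SubElement(root, "LINGUISTIC_TYPE", {
        "GRAPHIC_REFERENCES": "false",
        "LINGUISTIC_TYPE_ID": "default-lt",
        "TIME_ALIGNABLE": "true",
    })
    return ET.ElementTree(root)


if __name__ == "__main__":
    tree = build_minimal_eaf("Speech", 0, 1500)
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Saving with `tree.write(path, encoding="UTF-8", xml_declaration=True)` emits the `<?xml ...?>` declaration with its leading `<`, which avoids the hand-typing problem mentioned above.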

technosaby commented 2 years ago

So I found a way: I exported all the audio tags as a CSV and imported it into ELAN. The output looks like this... @brucearctor @turnermarkb
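
A rough sketch of the CSV export step, for anyone following along. It is not the exact script used here. The column names, the `(label, start_seconds, end_seconds)` input shape, and the function name `tags_to_csv` are assumptions. ELAN's CSV import lets you assign columns to tier, begin time, end time, and annotation in its dialog, so the header names themselves should not matter.

```python
import csv
import io


def tags_to_csv(tags):
    """Serialize (label, start_seconds, end_seconds) tuples as CSV text.

    Assumption: one row per detected tag, times converted to whole
    milliseconds, and the label reused as both tier name and annotation text.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["tier", "begin_ms", "end_ms", "annotation"])
    for label, start_s, end_s in tags:
        writer.writerow([
            label,
            int(round(start_s * 1000)),
            int(round(end_s * 1000)),
            label,
        ])
    return buffer.getvalue()


if __name__ == "__main__":
    print(tags_to_csv([("Speech", 0.0, 0.96), ("Music", 0.96, 1.92)]))
```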

image

brucearctor commented 2 years ago

Cool!

A couple thoughts:

1) Importing from CSV:

I wonder whether ELAN's import from CSV is sufficient on its own, rather than generating EAF ( @turnermarkb is better positioned to know users' workflows ). It might be totally fine [ though I cringe anytime I hear CSV these days ]. Otherwise, it would be possible to modify the ELAN source to create something that converts sfx -> eaf.

2) Usability:

There look to be ways to make the output more usable, namely condensing the short timeframes into longer blocks.

I also wonder whether you'd want distinct channels ( ex: rows in this view ) for each type of 'audio tag'. Researchers could then look ONLY at the rows/tags that interest them and delete/ignore the rest.

Naturally, that'd be more 'data engineering' work ( ex: ETL ) -- the tags already exist, so it is a matter of getting them into a form that is easy for users to consume.
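
A minimal sketch of both ideas, assuming the tags are available as `(label, start_ms, end_ms)` tuples: consecutive segments with the same label are merged into longer blocks, and the result is grouped by label so each label can become its own row/tier. The function name `merge_segments` and the `max_gap_ms` parameter are hypothetical, not part of the existing tagger.

```python
from collections import defaultdict


def merge_segments(segments, max_gap_ms=0):
    """Group segments by label and merge those that touch or overlap.

    segments: iterable of (label, start_ms, end_ms).
    max_gap_ms: segments of the same label separated by at most this many
    milliseconds are fused into one block (0 merges only touching/overlapping).
    Returns {label: [(start_ms, end_ms), ...]} with blocks in time order.
    """
    by_label = defaultdict(list)
    # Sort by label, then start time, so each label is processed in time order.
    for label, start, end in sorted(segments, key=lambda s: (s[0], s[1])):
        blocks = by_label[label]
        if blocks and start - blocks[-1][1] <= max_gap_ms:
            # Extend the previous block instead of starting a new one.
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            blocks.append([start, end])
    return {label: [tuple(b) for b in blocks] for label, blocks in by_label.items()}


if __name__ == "__main__":
    demo = [
        ("Speech", 0, 1000),
        ("Speech", 1000, 2000),
        ("Music", 500, 1500),
        ("Speech", 5000, 6000),
    ]
    for label, blocks in merge_segments(demo).items():
        print(label, blocks)
```

Raising `max_gap_ms` trades precision for fewer, longer blocks; what value suits researchers is a workflow question rather than something the code can decide.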