This is reasonably straightforward: use the http://bnc.phon.ox.ac.uk/filelist-textgrid.txt file (in the absence of other AudioBNC metadata) as the starting point for a series of commands that enrich the bnc_xml files with links to the audio data for our XSLT transformation.
For example, the line in that index text file pointing to this TextGrid file:
http://bnc.phon.ox.ac.uk/data/021A-C0897X0020XX-AAZZP0_002002_KBK_2.TextGrid
tells us that the tape with the ID 002002 in the file KBK.xml is associated with the audio file http://bnc.phon.ox.ac.uk/data/021A-C0897X0020XX-AAZZP0.wav
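As a rough sketch of that reading, the filename can be split into its parts in Python. The layout `<audio>_<tape>_<text>_<n>.TextGrid` is inferred from this single example, and the function name `parse_textgrid_url` is made up for illustration; other entries in the index may not follow the same pattern.

```python
import posixpath
from urllib.parse import urlsplit


def parse_textgrid_url(url):
    """Split a TextGrid URL into audio stem, tape ID and BNC text ID.

    Assumes the filename looks like <audio>_<tape>_<text>_<n>.TextGrid,
    which is inferred from one example line, not from documentation.
    """
    name = posixpath.basename(urlsplit(url).path)
    stem, ext = posixpath.splitext(name)
    if ext != ".TextGrid":
        raise ValueError(f"not a TextGrid file: {name}")
    # Split from the right so the audio stem is whatever precedes
    # the last three underscore-separated fields.
    audio, tape, text, _part = stem.rsplit("_", 3)
    base = url.rsplit("/", 1)[0]
    return {
        "audio": audio,
        "tape": tape,
        "xml_file": f"{text}.xml",
        "wav_url": f"{base}/{audio}.wav",
    }


info = parse_textgrid_url(
    "http://bnc.phon.ox.ac.uk/data/"
    "021A-C0897X0020XX-AAZZP0_002002_KBK_2.TextGrid"
)
print(info["tape"], info["xml_file"], info["wav_url"])
```

A line that does not split into four fields raises `ValueError`, so unexpected entries in the index show up rather than being silently mis-parsed.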
So it should be a trivial series of search/replace commands to turn http://bnc.phon.ox.ac.uk/filelist-textgrid.txt into a set of simple sed commands to modify the bnc_xml files.
This would work fairly straightforwardly:
KBK.xml contains one instance of a recording ID line:
A simple sed substitution such as s/n="002002" date/n="002002" audio="021A-C0897X0020XX-AAZZP0" date/ would yield a new line:
This would give us the ability to add an audio header (pointing to either a local or a remote file) in the .cha file.
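The conversion from index lines to sed commands could be sketched as follows. This is only an illustration under several assumptions: the filename pattern is inferred from the one example above, the XML files are assumed to sit in the current directory as `<text>.xml`, `sed -i` is the GNU form (BSD sed needs `-i ''`), and when a tape appears more than once in the index only the first audio file is used. The name `sed_commands` is hypothetical.

```python
import re

# Matches "<audio>_<tape>_<text>_<n>.TextGrid" at the end of an index line.
# The pattern is an assumption based on a single example filename.
NAME = re.compile(r"([A-Za-z0-9-]+)_(\d+)_([A-Za-z0-9]+)_\d+\.TextGrid$")


def sed_commands(index_lines):
    """Build one sed command per (text, tape) pair found in the index."""
    seen = set()
    commands = []
    for line in index_lines:
        match = NAME.search(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed pattern
        audio, tape, text = match.groups()
        if (text, tape) in seen:
            continue  # first audio file listed for a tape wins
        seen.add((text, tape))
        # Same substitution as in the text: add an audio attribute
        # between the n and date attributes of the recording line.
        expr = f's/n="{tape}" date/n="{tape}" audio="{audio}" date/'
        commands.append(f"sed -i '{expr}' {text}.xml")
    return commands


lines = [
    "http://bnc.phon.ox.ac.uk/data/"
    "021A-C0897X0020XX-AAZZP0_002002_KBK_2.TextGrid",
]
for command in sed_commands(lines):
    print(command)
```

Printing the commands rather than running them leaves room to inspect the generated script before it touches the bnc_xml files.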