HomeBankCode / rlena

R package for parsing LENA's .ITS files
GNU General Public License v2.0

some its files don't have timezones #7

Closed tjmahr closed 5 years ago

tjmahr commented 5 years ago

I got the following email. Some .its files don't have a timezone in the expected spot.

I noticed that some .its files do not have timezone information in the UPL_header section, so the gather_recordings function throws an error when it reads them.

For example, while most recordings show something like the following:


                <RecordingInformation>
                    <TransferTime LocalTime="2015-07-02T18:41:38" TimeZone="PST" UTCTime="2015-07-03T01:41:38" />
                    <Audio>
                        <Sampling DepthInBits="16" RateInHz="16000" Channels="1" />
                        <Coding Name="u-Law ADPCM" />
                        <TimeZone ZoneNameShort="GMT-08:00" StandardSecondsOffset="-28800" UsesDST="1" />
                    </Audio>
                </RecordingInformation>

some show only:

                <RecordingInformation>
                    <TransferTime LocalTime="2010-04-09T14:40:32" TimeZone="EST" UTCTime="2010-04-09T18:40:32" />
                    <Audio>
                        <Sampling DepthInBits="16" RateInHz="16000" Channels="1" />
                        <Coding Name="u-Law ADPCM" />
                    </Audio>
                </RecordingInformation>

It seems like it should be easy to read each file, search for a line that contains "TimeZone", and add one with the appropriate information if it is missing, but I haven't worked with .its files at all and am struggling with how to do this. Do you have any advice on how I could do this, or on how to get around the error otherwise? For reference, the Cougar data from the HomeBank corpus all have this issue.
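
One way to approach the workaround described in the email is to parse each file as XML rather than searching it line by line, and to add a TimeZone element to any Audio element that lacks one. The following is a minimal sketch in Python using only the standard library, not rlena's own method. The function name add_missing_timezone, the ITS wrapper element in the sample, and the fill-in values ("GMT-05:00", -18000, UsesDST="1" for a recording marked EST) are assumptions chosen by analogy with the complete example above; the correct values depend on where and when each recording was made.

```python
import xml.etree.ElementTree as ET


def add_missing_timezone(its_xml, zone_name_short, offset_seconds, uses_dst):
    """Add a <TimeZone> child to every <Audio> element that lacks one.

    Hypothetical helper: returns the patched XML text and the number of
    elements that were added.
    """
    root = ET.fromstring(its_xml)
    n_added = 0
    for info in root.iter("RecordingInformation"):
        audio = info.find("Audio")
        if audio is None or audio.find("TimeZone") is not None:
            continue  # nothing to patch in this block
        # Attribute names follow the complete example in the email.
        ET.SubElement(audio, "TimeZone", {
            "ZoneNameShort": zone_name_short,
            "StandardSecondsOffset": str(offset_seconds),
            "UsesDST": "1" if uses_dst else "0",
        })
        n_added += 1
    return ET.tostring(root, encoding="unicode"), n_added


# The second excerpt from the email, wrapped in an assumed root element.
SAMPLE = """<ITS>
  <RecordingInformation>
    <TransferTime LocalTime="2010-04-09T14:40:32" TimeZone="EST" UTCTime="2010-04-09T18:40:32" />
    <Audio>
      <Sampling DepthInBits="16" RateInHz="16000" Channels="1" />
      <Coding Name="u-Law ADPCM" />
    </Audio>
  </RecordingInformation>
</ITS>"""

# Assumed values for EST: five hours behind UTC, i.e. -18000 seconds.
patched, n_added = add_missing_timezone(SAMPLE, "GMT-05:00", -18000, True)
print(n_added)
```

Running the function a second time on the patched text adds nothing, so it is safe to apply to a folder that mixes complete and incomplete files. ElementTree drops the XML declaration and any comments when it re-serializes, so it is safer to write the result to a copy and leave the original .its file untouched.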