I think `1000242_T.xml` might be the only one we would want to address by changing the parsing code. We could in principle do this by changing the following line

https://github.com/iangow/se_core/blob/0f4c5b73eeaa941e1405b2787c9854eb047108d5/import_speaker_data.R#L103

but the problem is that this line

https://github.com/iangow/se_core/blob/0f4c5b73eeaa941e1405b2787c9854eb047108d5/import_speaker_data.R#L91

implies that I found cases where `Transcript` provided a demarcation between the XML preamble and the presentation part of the call. So we'd probably have some ugly code that says "if there are `Presentation` and `Transcript`, but no `Q&A`, then assume that `Transcript` begins the Q&A". It might make sense to build up a sample of files with `====\nTranscript` in them to be able to test this before implementing. Perhaps make a new issue for that, but don't do it just yet.

_Originally posted by @iangow in https://github.com/iangow/se_core/issues/15#issuecomment-619048982_
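
A rough sketch of how such a sample might be built, written as a standalone Python script rather than as a change to the R code in `import_speaker_data.R`. It assumes a section header is a run of at least four `=` characters, a newline, and then the section name, and that the Q&A header reads `Questions and Answers`; both are guesses that would need checking against the real files. The function names and the role labels (`qa_start`, `presentation_start`, `unclear`) are hypothetical.

```python
import re
from pathlib import Path

# Assumption: the Q&A header text; the real files may use a different label.
QA_HEADER = "Questions and Answers"

# Assumption: a header is a run of "=" (four or more), a newline, then the name.
HEADER_RE = re.compile(
    r"={4,}\r?\n(Presentation|Transcript|" + re.escape(QA_HEADER) + r")\b"
)


def section_headers(text):
    """Return the section headers found in `text`, in order of appearance."""
    return HEADER_RE.findall(text)


def transcript_role(text):
    """Guess what a `Transcript` header marks in one call's text.

    Returns None when there is no `Transcript` header at all.
    """
    headers = section_headers(text)
    if "Transcript" not in headers:
        return None
    if "Presentation" in headers and QA_HEADER not in headers:
        # The rule from the comment: Presentation and Transcript but no Q&A,
        # so assume Transcript begins the Q&A.
        return "qa_start"
    if "Presentation" not in headers:
        # Transcript separates the XML preamble from the presentation.
        return "presentation_start"
    # All three headers present: the comment does not say what to do here.
    return "unclear"


def sample_transcript_files(directory):
    """Map each *.xml file containing a Transcript header to its guessed role."""
    roles = {}
    for path in sorted(Path(directory).glob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        role = transcript_role(text)
        if role is not None:
            roles[path.name] = role
    return roles
```

Running `sample_transcript_files()` over the directory of call files would give a list of candidates to inspect by hand before touching the parser; the files labelled `unclear` are the ones most likely to break a simple rule.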