bbc / react-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
https://bbc.github.io/react-transcript-editor

Digital Paper Edit adapter returns word timings as strings rather than numbers (floats). #170

Open pietrop opened 5 years ago

pietrop commented 5 years ago

Describe the bug

The Digital Paper Edit adapter returns word timings as strings rather than numbers (floats).

To Reproduce

Steps to reproduce the behavior:

  1. Go to the demo app
  2. Click on load a Speechmatics transcript
  3. Add media from local media or a URL
  4. Click on export transcript as digital-paper-edit
  5. See error

Expected behavior

The start and end attributes of each word are expected to be numbers:

{
  "words": [
    {
      "text": "On",
      "start": 0.96,
      "end": 1.08,
      "id": 0
    },
    {
      "text": "one",
      "start": 1.08,
      "end": 1.23,
      "id": 1
    },
    {
      "text": "hand",
      "start": 1.23,
      "end": 1.44,
      "id": 2
    },
...

Instead, they are strings:

{
  "words": [
    {
      "id": 0,
      "start": "0.96",
      "end": "1.08",
      "text": "On"
    },
    {
      "id": 1,
      "start": "1.08",
      "end": "1.23",
      "text": "one"
    },
...

Additional context

NA

pietrop commented 5 years ago

It seems that in the Speechmatics transcript the start and end time attributes are strings:

...
{
    "start": "13.23",
    "end": "13.41",
    "confidence": "0.990",
    "word": "was",
    "punct": "was",
    "index": 1
},
...

So the adapter might need to parse those as floats.

It should also avoid casting them back to strings, e.g. here
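
A minimal sketch of the parsing step suggested above, assuming the adapter maps each Speechmatics word object to a Digital Paper Edit word. The helper name `normalizeWords`, the use of the array position as `id`, and the sample input are hypothetical and for illustration only; this is not the adapter's actual code.

```javascript
// Hypothetical helper (not the adapter's real implementation):
// converts Speechmatics-style words, whose timings are strings,
// into words whose start/end are numbers.
function normalizeWords(speechmaticsWords) {
  return speechmaticsWords.map((word, index) => ({
    id: index, // assumption: id is the position in the list
    start: parseFloat(word.start), // "13.23" -> 13.23
    end: parseFloat(word.end), // "13.41" -> 13.41
    text: word.word,
  }));
}

// Sample input shaped like the Speechmatics excerpt in this issue.
const sample = [
  {
    start: '13.23',
    end: '13.41',
    confidence: '0.990',
    word: 'was',
    punct: 'was',
    index: 1,
  },
];

const words = normalizeWords(sample);

// Serialising the result keeps start/end as JSON numbers, not strings.
console.log(JSON.stringify({ words }, null, 2));
```

The important part is that the values stay numeric all the way to the export. Any later step that formats them with something like `toFixed` or string concatenation would turn them back into strings.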