parser fails to parse subtitle files if data isn't coming in fast enough

mathiasvr / matroska-subtitles

💬 Streaming parser for embedded .mkv subtitles.

MIT License


parser fails to parse subtitle files if data isn't coming in fast enough #11

Closed ThaUnknown closed 3 years ago

ThaUnknown commented 3 years ago

The parser fails to fully parse embedded files if the data isn't coming in fast enough. This happens consistently on files that have a very large number of embedded files, e.g. btih:d6e48a098366f322ee82393b80045dee4b25bcd9, which is expected to have 27 files, but often fewer than 5 are parsed. I have used WebTorrent for this; is it possible it's a fault with their implementation of readable-stream? Code:

    const parser = new SubtitleParser()
    this.handleSubtitleParser(parser)
    parser.once('tracks', tracks => {
      if (!tracks.length) {
        parser.destroy()
        fileParserStream.destroy()
      }
    })
    parser.on('subtitle', () => {
      parser.destroy()
      fileParserStream.destroy()
    })
    const fileParserStream = file.createReadStream()
    fileParserStream.pipe(parser)

mathiasvr commented 3 years ago

I haven't been able to reproduce this, but I probably haven't managed to replicate the "data not coming in fast enough" condition.

In your code snippet you don't have the file event handler which is actually outputting the embedded files?

ThaUnknown commented 3 years ago

> I haven't been able to reproduce this, but I probably haven't managed to replicate the "data not coming in fast enough" condition.
>
> In your code snippet you don't have the file event handler which is actually outputting the embedded files?

This seems to be partially a WebTorrent issue: the parser tries to pull data which hasn't been downloaded yet, and once enough data is available it seems to "forget" that it was in the middle of parsing the file?

my file event handler is as follows:

      // MIME types that are treated as embedded fonts
      const fontTypes = ['application/x-truetype-font', 'application/font-woff', 'application/vnd.ms-opentype']
      parser.on('file', file => {
        if (fontTypes.includes(file.mimetype)) {
          // expose the font data to the renderer as an object URL
          this.subtitleData.fonts.push(URL.createObjectURL(new Blob([file.data], { type: file.mimetype })))
        }
      })
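On the "forgetting" hypothesis above: a streaming parser normally keeps any partially received element buffered, so the arrival speed or size of chunks should not change the result. The sketch below illustrates that idea only. It uses a made-up length-prefixed format and a hypothetical `ChunkedRecordParser` class; it is not Matroska/EBML and not how matroska-subtitles is implemented.

```javascript
// Hypothetical format (an assumption for illustration):
// 1 length byte N, followed by N bytes of UTF-8 payload.
class ChunkedRecordParser {
  constructor () {
    this.pending = Buffer.alloc(0) // bytes received but not yet parsed
    this.records = [] // every record completed so far
  }

  // Feed one chunk of any size; returns the records completed by this chunk.
  push (chunk) {
    this.pending = Buffer.concat([this.pending, chunk])
    const completed = []
    // Keep parsing while a whole record (length byte + payload) is buffered.
    while (this.pending.length >= 1 && this.pending.length >= 1 + this.pending[0]) {
      const length = this.pending[0]
      completed.push(this.pending.subarray(1, 1 + length).toString('utf8'))
      this.pending = this.pending.subarray(1 + length)
    }
    this.records.push(...completed)
    return completed
  }
}

// Two records: "abc" and "de".
const data = Buffer.from([3, 0x61, 0x62, 0x63, 2, 0x64, 0x65])

// "Fast" source: everything arrives in a single chunk.
const fast = new ChunkedRecordParser()
fast.push(data)

// "Slow" source: one byte per chunk.
const slow = new ChunkedRecordParser()
for (const byte of data) slow.push(Buffer.from([byte]))

console.log(fast.records, slow.records)
```

If a parser behaves like this, both sources yield the same records, so missing files would more likely point at the stream being ended or destroyed early than at slow data.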

mathiasvr commented 3 years ago

Might be related to losing state when opening a new instance of the stream parser. You're still doing random access, right? Or does this happen with a single parser instance as in your snippet?

ThaUnknown commented 3 years ago

> Might be related to losing state when opening a new instance of the stream parser. You're still doing random access, right? Or does this happen with a single parser instance as in your snippet?

Single parser instance. As you can see in the first snippet, I create a parser, not a stream, and it is destroyed when it encounters the first subtitle or if there are no subtitle tracks embedded in the video. Embedded files come before any video data. This was my fix for that instancing problem with multiple streams.

This was mostly a test case. In my player I now use a SubtitleStream instead of a parser; it starts at 0, waits for tracks, then gets cloned on seeking, but the end result is the same.

ThaUnknown commented 3 years ago

I'm sorry, it seems like I fucked up somewhere in my initial testing and reproduction of this issue. I think I used the stream instead of the parser, and the stream stops after it parses metadata, or I destroyed the stream prematurely. Anyway, this issue is invalid because of my stupidity XD

I fixed this by creating a parser that gets cloned by a stream after it parses the first subtitle.
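For anyone landing here with the same symptom, here is a rough sketch of how stopping a parser too early leads to fewer `file` events. `FakeParser`, `countFiles`, `stopOnMetadata` and the element order are all hypothetical stand-ins chosen for illustration; this is not the matroska-subtitles API and says nothing about the real layout of a given .mkv file.

```javascript
const { EventEmitter } = require('events')

// Hypothetical stand-in for a streaming parser: emits one event per element
// and stops emitting as soon as destroy() has been called.
class FakeParser extends EventEmitter {
  constructor () {
    super()
    this.destroyed = false
  }

  destroy () {
    this.destroyed = true
  }

  feed (elements) {
    for (const [name, payload] of elements) {
      if (this.destroyed) return // nothing after destroy() is ever seen
      this.emit(name, payload)
    }
  }
}

// Counts the 'file' events a consumer receives before the parser is destroyed.
function countFiles (elements, { stopOnMetadata }) {
  const parser = new FakeParser()
  let files = 0
  parser.on('file', () => { files++ })
  // Simulates a consumer that stops right after the metadata.
  parser.on('tracks', () => { if (stopOnMetadata) parser.destroy() })
  // Simulates a consumer that stops at the first subtitle.
  parser.on('subtitle', () => parser.destroy())
  parser.feed(elements)
  return files
}

// Assumed element order, for illustration only.
const elements = [
  ['tracks', [{ number: 1 }]],
  ['file', { filename: 'a.ttf' }],
  ['file', { filename: 'b.ttf' }],
  ['subtitle', { text: 'hi' }],
  ['file', { filename: 'late.ttf' }]
]

console.log(countFiles(elements, { stopOnMetadata: false })) // 2
console.log(countFiles(elements, { stopOnMetadata: true })) // 0
```

With this assumed order, stopping at the first subtitle still delivers the two files that precede it, while stopping right after the metadata delivers none.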