khenriks / mp3fs

FUSE-based transcoding filesystem from FLAC to MP3
http://khenriks.github.io/mp3fs/
GNU General Public License v3.0

Proposal for adding VBR encoding #31

Closed robertyseward closed 9 years ago

robertyseward commented 9 years ago

This is my proposal for adding VBR encoding. It takes khenriks' suggestion of using a pessimistic size for the file size a step further: it uses the pessimistic size (the size as if constant bit rate encoding were used) until the real file size has been computed, then caches the real size and uses it from then on. Always using the pessimistic size is no better than just using CBR, so it is pretty useless for VBR.

With the pessimistic size, the ID3v1 tag is still at the end of the file, but with empty space between the end of the encoded data and the tag. This is necessary because MP3 readers always get the ID3v1 tag from the last 128 bytes of the file, regardless of what else is in the file.
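To make the layout concrete, here is a rough sketch of how a read against the pessimistic size can be served (a hypothetical helper, not the actual mp3fs code): bytes beyond the encoded data come back as zero padding, except for the final 128 bytes, which always map to the ID3v1 tag.

```cpp
// Sketch only: serve a read when the reported size is the pessimistic bound.
// Assumes stat_size >= ID3V1_SIZE and that the encoded data never reaches
// into the last 128 bytes.
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <sys/types.h>

static const size_t ID3V1_SIZE = 128;

size_t read_transcoded(const uint8_t* encoded, size_t encoded_len,
                       const uint8_t id3v1[ID3V1_SIZE],
                       size_t stat_size,   // pessimistic (CBR-style) size
                       uint8_t* buf, size_t count, off_t offset) {
    if (offset < 0 || static_cast<size_t>(offset) >= stat_size)
        return 0;
    count = std::min(count, stat_size - offset);
    std::memset(buf, 0, count);   // the gap between data and tag reads as zeros

    // Part of the read that overlaps the real encoded data.
    if (static_cast<size_t>(offset) < encoded_len) {
        size_t n = std::min(count, encoded_len - offset);
        std::memcpy(buf, encoded + offset, n);
    }

    // Part of the read that overlaps the ID3v1 tag, which is pinned to the
    // last 128 bytes of the reported file size.
    size_t tag_start = stat_size - ID3V1_SIZE;
    if (offset + count > tag_start) {
        size_t from = std::max<size_t>(offset, tag_start);
        std::memcpy(buf + (from - offset), id3v1 + (from - tag_start),
                    offset + count - from);
    }
    return count;
}
```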

The stat cache caches the file size, the modified time of the FLAC file, and the last time the cache entry was accessed. The modified time is used to detect when the FLAC file has changed (which invalidates the cache entry). The access time is used to implement a least-recently-used policy: when the cache grows above a threshold specified on the command line (--statcachesize), the least recently used entries are removed. At that point 10% of the entries are purged (instead of just one) because each purge is somewhat expensive.
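As a rough sketch of that policy (a hypothetical class, not the code in this patch, sized by entry count for simplicity and shown without the mutex that would guard it):

```cpp
// Stat cache sketch: entries keyed by source path, invalidated when the FLAC
// mtime changes, and pruned ~10% at a time in LRU order once a threshold is hit.
#include <algorithm>
#include <ctime>
#include <map>
#include <string>
#include <sys/types.h>
#include <utility>
#include <vector>

class StatCache {
public:
    explicit StatCache(size_t max_entries) : max_entries_(max_entries) {}

    // Returns true and fills *size if a valid entry exists for (path, mtime).
    bool lookup(const std::string& path, time_t mtime, off_t* size) {
        auto it = entries_.find(path);
        if (it == entries_.end())
            return false;
        if (it->second.mtime != mtime) {   // the source FLAC changed
            entries_.erase(it);
            return false;
        }
        it->second.atime = time(nullptr);  // track use for LRU eviction
        *size = it->second.size;
        return true;
    }

    void insert(const std::string& path, time_t mtime, off_t size) {
        if (entries_.size() >= max_entries_)
            prune();
        entries_[path] = Entry{mtime, time(nullptr), size};
    }

private:
    struct Entry { time_t mtime; time_t atime; off_t size; };

    // Drop the least recently used ~10% of entries in one pass.
    void prune() {
        std::vector<std::pair<time_t, std::string>> by_age;
        for (const auto& kv : entries_)
            by_age.emplace_back(kv.second.atime, kv.first);
        std::sort(by_age.begin(), by_age.end());
        size_t to_remove = std::max<size_t>(1, entries_.size() / 10);
        for (size_t i = 0; i < to_remove && i < by_age.size(); ++i)
            entries_.erase(by_age[i].second);
    }

    size_t max_entries_;
    std::map<std::string, Entry> entries_;
};
```

On a getattr, the caller would try lookup() first and fall back to the pessimistic CBR size; after a full transcode it would insert() the real size.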

One tricky thing about VBR encoding is the Xing section. This section contains (among other things) the average encoding bitrate, which most programs use to compute the play time of the MP3. It is part of the first frame, so it sits essentially at the beginning of the file, yet the average bitrate is not known until the entire file has been encoded. So lame writes dummy values into this section at first, and a call is made to rewrite it once encoding is done. The implication of all this is that this implementation encodes the entire file for any read.
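The pattern lame requires looks roughly like this (a simplified sketch, not mp3fs's actual encoder code; error handling and incremental reads are omitted):

```cpp
// Encode a whole track with VBR, then patch the Xing/LAME header frame at the
// start of the output once the real statistics are known.
#include <lame/lame.h>
#include <cstring>
#include <vector>

std::vector<unsigned char> encode_vbr(const short* left, const short* right,
                                      int nsamples, int sample_rate) {
    lame_global_flags* gfp = lame_init();
    lame_set_in_samplerate(gfp, sample_rate);
    lame_set_VBR(gfp, vbr_default);
    lame_init_params(gfp);

    // lame's documented worst-case output bound: 1.25 * samples + 7200 bytes.
    std::vector<unsigned char> out(static_cast<size_t>(1.25 * nsamples) + 7200);
    int n = lame_encode_buffer(gfp, left, right, nsamples,
                               out.data(), static_cast<int>(out.size()));
    n += lame_encode_flush(gfp, out.data() + n, static_cast<int>(out.size()) - n);
    out.resize(n);

    // The first frame written above holds only placeholder Xing values. After
    // encoding, lame can produce the finished header frame, which must be
    // copied back over the start of the stream.
    std::vector<unsigned char> tag(out.size());
    size_t tag_len = lame_get_lametag_frame(gfp, tag.data(), tag.size());
    if (tag_len > 0 && tag_len <= out.size())
        std::memcpy(out.data(), tag.data(), tag_len);

    lame_close(gfp);
    return out;
}
```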

Because any read triggers an encoding of the entire file, performance is not that great in many real scenarios. There really needs to be a file cache (in parallel with the stats cache) to cache a small number of files (really, a cache with just one entry would solve most of the problems). But, this change was already big enough; the file cache can come later.

khenriks commented 9 years ago

Sorry for the delayed response.

Thanks for your contribution, and for your thoughtful explanation. In addition to all the line comments I made above, I had a couple of general thoughts. I think the caching layer would be better off in a separate file, since it's a neat logical distinction, and encapsulating the cache in a class would be a nice change too. I haven't looked too closely at the cache code yet, but I'm a bit worried that the locks are pretty coarse-grained, so the code might be spending more time than it needs to waiting on the lock. But like I said, I haven't looked closely yet.

Thanks for your work!

robertyseward commented 9 years ago

Moving the caching layer into a separate file makes sense. I will do that, along with addressing your other comments.

I believe that a single lock on the stats cache is OK, as all the operations done while the lock is held are trivial; in particular, the code never holds the lock while transcoding or doing file operations.

That said, the file cache code (that you haven't seen yet) does hold locks when transcoding, so it has a lock per transcoder object. This makes things a bit more complicated, but is necessary for concurrent transcoding operations.
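Roughly, the split looks like this (hypothetical types, not the code under review): a short-lived lock protects the cache map itself, while each cached transcoder carries its own lock that is held for the duration of the encode.

```cpp
// Sketch of two-level locking for the file cache: the map lock is never held
// while transcoding, so different files can be transcoded concurrently.
#include <map>
#include <memory>
#include <mutex>
#include <string>

struct CachedTranscoder {
    std::mutex lock;   // held while this particular file is being transcoded
    // ... transcoder state, encoded buffer, etc.
};

class FileCache {
public:
    std::shared_ptr<CachedTranscoder> get(const std::string& path) {
        std::lock_guard<std::mutex> guard(map_lock_);   // short critical section
        auto& entry = entries_[path];
        if (!entry)
            entry = std::make_shared<CachedTranscoder>();
        return entry;
    }

private:
    std::mutex map_lock_;   // protects only the map, never held while encoding
    std::map<std::string, std::shared_ptr<CachedTranscoder>> entries_;
};

// Caller pattern: look up the entry under the map lock, then do the expensive
// work under the per-transcoder lock.
//   auto t = cache.get(path);
//   std::lock_guard<std::mutex> guard(t->lock);
//   ... transcode using t ...
```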

khenriks commented 9 years ago

One other comment I meant to make but somehow forgot when typing my comments: it looks like your stat cache size is defined as an absolute size in kilobytes. I think it would simplify the logic in a few places if you defined it as a fixed number of entries instead. The cached_entry_size function seems to rely on some implementation details, and this other definition would let you avoid that.

robertyseward commented 9 years ago

I originally had the cache size in terms of entries, but thought that might be less helpful from a user's perspective. But since you are OK with counting entries, I will make that change, as it does make things simpler.

robertyseward commented 9 years ago

I should add that the file cache (which is coming) probably should be in terms of bytes, as the amount of memory used can vary significantly with the file, and the user probably cares more about how much memory is being used.

khenriks commented 9 years ago

That's fine, since in the case of a file cache the size of the file is something we are explicitly responsible for.

You could include a rough guess at the average size of one entry (100 bytes or so?) in the help text or the manual page for the stat cache, to give users some idea of how much memory they are asking for.

robertyseward commented 9 years ago

OK, I moved the stats cache code into its own file and addressed all your other comments. I also realized that a bunch of stat() calls were being made while holding the lock, which seemed bad for performance, so I reworked that code a bit. Please take another look.

khenriks commented 9 years ago

This is a good contribution, and I've taken too long to review it because life has gotten in the way. Rather than fret over it further, I'm accepting it now; any issues can be ironed out as they come up.