Open GoogleCodeExporter opened 9 years ago
Original comment by openvcd...@gmail.com
on 24 Oct 2008 at 12:27
Original comment by openvcd...@gmail.com
on 6 Aug 2010 at 10:56
This would be a very useful feature for our use. Some of our files are very
large, around 500 MB, and the device where we're decoding is memory
constrained. Being able to decode on an incremental basis without exhausting
memory is the way to go.
Original comment by john.b.s...@gmail.com
on 29 Oct 2010 at 4:14
The source file is less of a problem, assuming your target platform supports
memory mapped files. Passing a read-only mmapped region as the source
dictionary works well in practice. Streaming over a network is always going to
be difficult due to the random-access nature of the dictionary lookups. You'd
have to reengineer the diff format to be ordered by source address, not target
address to allow source streaming.
The bigger memory issue I've encountered (on iOS devices) is that the entire
decoded *target* file is kept in memory at once, making the streaming output
interface (via implementing OutputString) somewhat farcical. There doesn't seem
to be an interface for custom allocation of the target memory, which would
allow passing in a read-write memory mapped file. With such a setup, the OS
could evict the regions which aren't used for back-references within the target
file.
I'll look into patching the code to allow for such a mode of operation.
Original comment by phil%phi...@gtempaccount.com
on 29 Nov 2010 at 4:09
Very good news. We're not planning on streaming over a network. We'd get the
dictionary file to the device via a delivery channel, but would then want to be
able to apply the diff and read the source file in manageable chunks. And of
course, writing out the target file as it is created rather than keeping it all
in memory until the end would make this feasible on memory limited devices.
Thanks!
Original comment by john.b.s...@gmail.com
on 29 Nov 2010 at 5:41
> The bigger memory issue I've encountered (on iOS devices)
> is that the entire decoded *target* file is kept in memory
> at once, making the streaming output interface
> (via implementing OutputString) somewhat farcical.
> There doesn't seem to be an interface for custom allocation
> of the target memory, which would allow passing in a
> read-write memory mapped file. With such a setup,
> the OS could evict the regions which aren't used for
> back-references within the target file.
>
> I'll look into patching the code to allow for such a mode
> of operation.
Please take a look at issues 9 and 19. As long as you specify the option
--allow_vcd_target=false, or call VCStreamingDecoder::SetAllowVcdTarget(false),
only the current target window will be kept in memory, not the whole target
file. Using this option (perhaps in combination with --max_target_file_size)
will hopefully mitigate the problem you've described.
Your suggestion of using a memory-mapped file for output is intriguing. Issue
12 includes a related suggestion of passing a memory buffer to the decoder,
which should fill it and report whether the input has been exhausted or not.
Original comment by openvcd...@gmail.com
on 30 Nov 2010 at 12:46
I encountered the allow_vcd_target option while digging through the source to
diagnose this problem, but it wasn't clear to me whether it was safe to use.
Thanks for clearing this up.
I have however meanwhile hacked together a change to the decoder which allows
supplying your own target buffer manager instead of using std::string for
decoded_target_. This could be useful even where you can hold the whole file in
memory, as resizing the buffer can be implemented without a copy using mmap()ed
anonymous memory. (or indirectly via realloc()) Using std::string with the
default allocator also causes some severe memory fragmentation...
In my case, in iOS, I implement a buffer which mmap()s the output file, with
reserve() resizing the file with ftruncate() and remapping. The streaming
output is just a no-op dummy. This works well, although the API is very hacky
at the moment (passes all tests, however). I'll try to tidy it up and post a
patch.
In practice, I think I'll use both approaches, as I don't see any immediate
disadvantage to the mmap() solution, and VCD_TARGET acts as a hint that a
mapped region is no longer required.
Original comment by phil%phi...@gtempaccount.com
on 30 Nov 2010 at 11:07
Sounds great. I encourage you to contribute your changes if you think they
would be useful to others. Please take a look at the development guide:
http://code.google.com/p/open-vcdiff/wiki/OpenVcdiffDevGuide
Original comment by openvcd...@gmail.com
on 30 Nov 2010 at 5:02
Original issue reported on code.google.com by
lordje...@gmail.com
on 21 Oct 2008 at 10:48