Closed branewave closed 5 years ago
Thanks for reporting this. And thanks for proposing a fix. I worked with Tuan (an outstanding GSoC student at NUS in Singapore) on RemoteIo in 2013. I seem to recall discussing this line of code with him!
I will investigate this for 0.27.3 which is scheduled for release on 2019-09-30.
Typically blockSize == 1024
and idx_ is the index of the stream. The result is
return data[p_->idx_++ - expectedBlock*p_->blockSize_];
When the index is 1023, we need block [0]
.
data[1023];
When the index is 1024, we need block [1]
.
data[1024 - 1*1024]; == data[0]
I modified blockSize_ to be 1 and tried simple tests such as:
$ bin/exiv2 -pa http://clanmills.com/Stonehenge.jpg
Oddly, everything seems to be working with/without your fix! (with http/https/curl/http.cpp). I think the reason is because RemoteIo::getb()
isn't used to parse images. I believe the code in image->readMetadata()
uses RemoteIo::read()
. I am currently writing a book Image Metadata and Exiv2 Architecture. I'll visit this again when I write the chapter: 7 I/O in Exiv2
which will discuss BasicIo in detail.
The suggested fix seems sound to me and I have submitted PR #1007 for team review and approval.
Very pleased to confirm that getb() is correct with this change. I've written a new IO test program which will be published in the book. The essence of this is a copy function which reads from http to file as follows:
static int cp (int argc,const char** argv,const Options& options)
{
// printf("cp argc = %d ",argc); for ( int i = 0 ; i < argc ; i++ ) printf("%s ", argv[i]); printf("\n");
int rc = 0 ;
Exiv2::BasicIo::AutoPtr pIn = Exiv2::ImageFactory::createIo(argv[0],true);
if ( rc == 0 && pIn->open() != 0 ) {
fprintf(stderr,"unable to open input file %s\n",argv[0]);
rc = 1 ;
}
Exiv2::FileIo out(argv[1]);
if ( rc == 0 && out.open("w+b") != 0 ) {
fprintf(stderr,"unable to open output file %s\n",argv[1]);
rc = 1;
}
if ( rc == 0 ) {
int count = 0;
char method[100];
strcpy(method,"unknown");
bool bEnd = pIn->eof() ;
if ( options.getb ) {
strcpy(method,"getb");
while ( !bEnd ) {
Exiv2::byte b = (Exiv2::byte) pIn->getb();
bEnd = pIn->eof();
if ( !bEnd ) {
out.putb(b);
count++;
}
}
} else {
strcpy(method,"read");
Exiv2::byte buffer[128 * 1024];
while (!bEnd) {
int n = pIn->read(buffer, sizeof(buffer));
out.write(buffer, n);
count += n;
bEnd = pIn->eof();
}
}
printf("copied %d bytes by %s method\n",count,method);
}
pIn->close();
return rc;
UNUSED(options);
}
The input options.getb
dictates whether the copy is performed byte-by-byte by getb()
, or in blocks using read()
. Both read the same number of bytes. With the change in PR #1007 the output and input files are now indentical.
When RemoteIo::getb is called for a byte at the very end of a block, it will populate the wrong block, and then read a byte from before the start of the "data" array. The fix is to remove the "+ 1" from the "expectedBlock" calculation.
size_t expectedBlock = (p_->idx_ + 1)/p_->blockSize_;