tarsh / open-vcdiff

Automatically exported from code.google.com/p/open-vcdiff
Apache License 2.0
0 stars 0 forks source link

API should use void*, not char*, for binary data #4

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Several API methods take parameters that are pointers to arbitrary binary data. 
These parameters 
are typed as "const char*". Because of this, clients of the API that have data 
stored in other types, 
have to static_cast their pointers to pass them to vcdiff.

Instead, vcdiff's API should use "const void*" instead of "const char*". This 
permits any const 
pointer to be passed in without a type-cast, and also makes it clear to the 
human reader that the 
method takes arbitrary data, not just C strings. This is the usual convention 
in C/C++ APIs 
nowadays.

As a specific example, change vcencoder.h:70 from:
  HashedDictionary(const char* dictionary_contents,
                   size_t dictionary_size);
to:
  HashedDictionary(const void* dictionary_contents,
                   size_t dictionary_size);
(There are other instances of this at other points in the API.)

Original issue reported on code.google.com by Jens.Al...@gmail.com on 5 Sep 2008 at 8:28

GoogleCodeExporter commented 9 years ago
Hi Jens:

Thanks much for your interest in open-vcdiff, and for sharing your feedback.

I think the use of "const char*" versus "const void*" is debatable.

Passing a "const char*" for dictionary_contents specifies unambiguously the 
units in
which dictionary_size is expressed: the dictionary contains dictionary_size char
values (bytes.)  One of my colleagues pointed out that this has the advantage 
that a
pointer to the end of the dictionary can be found by calculating 
(dictionary_contents
+ dictionary_size); such a calculation would not work with a void* and would 
require
an explicit cast.

On the other hand, standard functions such as memcmp and memcpy take void* 
arguments
to represent pointers to contiguous bytes.  The existence of those interfaces
supports your position.

As another reference point from the compression milieu, zlib uses Bytef* for its
input and output, which is a typedef for "unsigned char*".

Would anyone else care to offer an opinion on this issue?

Saludos,

Lincoln Smith
Software Engineer
Google

Original comment by openvcd...@gmail.com on 11 Sep 2008 at 7:01

GoogleCodeExporter commented 9 years ago
- The type 'size_t' already unambiguously denotes a byte count.

- Your point about offsetting from the dictionary pointer relates to the 
library implementation; but the design of 
an API should give the convenience of _clients_ higher priority (unless it 
would severely penalize the 
implementation, of course.) So I don't think it's a good balance to require 
clients to cast the pointers, just to save 
a few casts in the implementation. [Also, FYI, gcc allows pointer arithmetic on 
void*s for exactly this reason.]

Original comment by Jens.Al...@gmail.com on 11 Sep 2008 at 8:44

GoogleCodeExporter commented 9 years ago
If the client wanted to do calculations they could store it as a const char* 
themselves. When they pass it to the library it will work fine as a const 
void*. Also remember const char* and const unsigned char* are not the same 
everywhere. Using const char* means everyone using the library for binary data 
will have to cast. That is inconvenient.

Also, send() and other socket methods use const void*. This would just be 
improving consistency with other network APIs.

Original comment by xoc...@gmail.com on 20 Jan 2011 at 4:28