Closed Rondom closed 7 years ago
I'd put this on the backburner because you'd fix an issue that will probably never hit.
Instead, I'd focus on defining and stabilizing the 2.0.0 API, and it would be perfectly fine for me to have all 2.y.z without 2-GB-module support.
But I can really appreciate this desire to fix obscure corner cases. This would likely get merged if you made a thoroughly-tested PR.
Indeed it is quite unlikely. But at least it should not crash. I have not checked whether it does, but right now the -1
that the ftell
will probably return is saved into the filesize
.
I am not sure where it is used, but it might create problems.
Maybe I will create a PR that -- as a first step -- extends the API, but still only supports 2GB files for the file-based-backend, i.e. will cleanly error. Otherwise we would need yet another ABI-bump (or write wrapper code for compatibility) to support the function-pointers with a different signature.
Sounds good to me -- have this in the 2.0 API where breakage okay, and treat small modules as before. Then add full support later.
Fixing the ftell so it handles errors is sort of important, but many things in the new interface will automatically fail if the position is greater than the size, and with a negative size, this is always the case. At least I think I made that happen. Not totally sure without actually looking at it again.
@Rondom, could you propose the new API within 2-3 days? This seems to be the final change to the 2.0 API.
I'd love to adapt my A5 pull request to any changes here, then get it merged. The A5 team plan a release in early-mid October, otherwise music support would wait another 3 months.
I don't mind changing certain APIs and variables to the declared LONG_LONG macro type. I just need to change all of the relevant loaders and parsers to use that variable type to receive file offsets.
Just note that we may only need to support rejecting files with offsets that large. It is still possible to use the current API to create a 32-bit or smaller window within a large file, and keep your own private base file offset for seeking operations, as well as a private size for the file size operation.
Note that quite a number of the formats are limited to 32 bit addressing anyway. At least Impulse Tracker modules contain 32 bit file offsets to most of the structures. And the three RIFF based formats are limited to RIFF 32 bit, and have never been seen in RF64 format.
I'm fine with either strategy:
The publicly visible typedef LONG_LONG
is not prefixed with DUMB_
. Should it be renamed to DUMB_OFF_T
? I don't have enough experience across many different C projects to judge whether there are other meaningful definitions than ours. We guard it with #ifndef
, but then it still depends on header order. Expert advice welcome.
Do we even need a macro anymore? C99 defines long long
in the C spec as a 64-bit-or-longer signed integer.
If you still prefer the 64-bit API, here's my take:
#define DUMB_OFF_T long long
// assumes C99, that's OK. off_t
would be signed, too
// remove the define for LONG_LONG
or make it internal
typedef struct DUMBFILE_SYSTEM
{
void *(*open)(const char *filename);
int (*skip)(void *f, DUMB_OFF_T n);
int (*getc)(void *f);
DUMB_OFF_T (*getnc)(char *ptr, DUMB_OFF_T n, void *f);
void (*close)(void *f);
int (*seek)(void *f, DUMB_OFF_T n);
DUMB_OFF_T (*get_size)(void *f);
}
DUMBFILE_SYSTEM;
DUMB_OFF_T dumbfile_pos(DUMBFILE *f);
int dumbfile_skip(DUMBFILE *f, DUMB_OFF_T n);
int dumbfile_seek(DUMBFILE *f, DUMB_OFF_T n, int origin);
DUMB_OFF_T dumbfile_get_size(DUMBFILE *f);
DUMB_OFF_T dumbfile_getnc(char *ptr, DUMB_OFF_T n, DUMBFILE *f);
DUMBFILE *dumbfile_open_memory(const char *data, DUMB_OFF_T size);
DUH *make_duh(DUMB_OFF_T length, /* ... */);
DUMB_OFF_T duh_get_length(DUH *duh);
void duh_set_length(DUH *duh, DUMB_OFF_T length);
I agree with everything you said. Defining LONG_LONG bothered me as well. I might have a look tonight or tomorrow.
@kode54: any comments on my suggested API? That's a release candidate of the 2.0 API, please scrutinize: Did I miss functions that need 64-bit offsets? Did I put 64-bit arguments in functions that don't need them?
If you're OK with it, and unless @Rondom has something within 12 hours, I'll implement the above API tomorrow, and make a PR in about 24 hours.
I'll watch the issues and PRs in case anything happens in-between, and I'll also sit in IRC for chat or questions: irc.freenode.net #allegro, or irc.quakenet.org #lix.
I posted a PR #59. Your suggested changes are fine except that getnc should not return and take off_t, but rather size_t. On a 32-bit system sizeof(off_t)>sizeof(size_t)
Please review carefully, because I agree with you, that it is easy to miss things or change too much.
Thanks for the work, most of it looked very good already. PR #60 is a smaller change for the remainder.
@kode54: What is a DUH
, exactly? How exactly is its length
used throughout the code? It's the length of what? Should DUH.length
be:
size_t
(length in memory), probably not because DUMB calls it with -1,dumb_ssize_t
(length in memory or error code), ordumb_off_t
(length inside a file on disk, negatives possible, e.g., to indicate error)?PR #60 keeps it as dumb_off_t
. My gut instinct is that's correct, it probably describes stuff on disk. But I'm really unsure. While DUH
is private, make_duh
is public and takes a parameter of length
's type. We should get this right. :-)
Closed by PR #59 and PR #60
I know that 2GB-MOD-files are very unlikely to occur, but I am wondering whether it makes sense to support files >2GB.
I propose the following
-D_FILE_OFFSET_BITS=64
during build andftello
returns a 64-bit-type on 32-bit Unix-platformsDUMBFILE_SYSTEM
to takeDUMB_OFF_T
which resolves to a 64-bit-type (can be checked at compile-time)What do others think?