twinbasic / lang-design

Language Design for twinBASIC
MIT License

File IO support - big file support #40

Open WaynePhillipsEA opened 2 years ago

WaynePhillipsEA commented 2 years ago

This issue is split from the main File IO issue (#279), so that we can track and discuss how best to offer big file (>4GB) support throughout the File IO syntax and accompanying library functions.

This feature is not yet implemented. We will look to implement this in Q1 2022.

As @Kr00l suggested in twinbasic/twinbasic#279, it would be nice for the file IO syntax and accompanying functions to support big files. Currently, file positions are treated as 32-bit values, which limits the maximum file size supported by VBx and tB. Several areas of the syntax need to be tweaked, and several of the library functions will also need to be tweaked, or extended versions provided.
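To make the 32-bit limit concrete, here is a minimal sketch in Python (not twinBASIC), assuming file positions are stored as signed 32-bit integers the way a VBx `Long` is. The helper name `fits_in_long` is hypothetical and only serves the illustration.

```python
import struct

# Largest value a signed 32-bit integer (a VBx Long) can hold.
LONG_MAX = 2**31 - 1


def fits_in_long(position: int) -> bool:
    """Return True if `position` can be stored in a signed 32-bit integer."""
    try:
        # "<i" is a little-endian signed 32-bit integer; packing raises
        # struct.error when the value is out of range.
        struct.pack("<i", position)
        return True
    except struct.error:
        return False


one_gib = 1024**3
print(fits_in_long(one_gib))      # True: 1 GiB fits
print(fits_in_long(3 * one_gib))  # False: 3 GiB does not fit
```

Under that assumption, any position above 2,147,483,647 bytes cannot be represented, which is the limit the rest of this thread refers to as 2GB.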

@Kr00l has suggested that new library functions should be offered, specifically for big-file support. e.g. FileLen64 / LOF64 / Loc64... etc.

wqweto commented 2 years ago

Does TB support native LongLong variables for the x86 target (not wrapped in a Variant)?

I was wondering if it's possible for the original FileLen / LOF / etc. functions/statements to be repurposed to return a Variant, which would be I4 for positions below 2GB and I8 for larger offsets.

Coupled with implicit LongLong to Long conversion failing, as in x64 VBA, this seems like a reasonable compromise.

I mean this way one has two choices: work with LongLong variables for offsets if you want transparent large-file support from the runtime, or keep it as currently implemented with Long offsets if you want to risk it failing on the I8 to I4 conversion when file sizes/offsets exceed the 2GB limit.
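The proposal above can be sketched roughly as follows. This is a Python model of the suggested semantics, not twinBASIC code and not anything the runtime currently does; `Variant`, `file_len` and `to_long` are hypothetical names made up for the illustration.

```python
from typing import NamedTuple

LONG_MIN, LONG_MAX = -2**31, 2**31 - 1  # range of a 32-bit Long


class Variant(NamedTuple):
    """Hypothetical stand-in for a Variant holding an I4 or I8 value."""
    vartype: str  # "I4" or "I8"
    value: int


def file_len(size_in_bytes: int) -> Variant:
    """Model of a FileLen that returns I4 when the size fits, I8 otherwise."""
    vartype = "I4" if size_in_bytes <= LONG_MAX else "I8"
    return Variant(vartype, size_in_bytes)


def to_long(v: Variant) -> int:
    """Model of assigning the result to a Long: fails when out of range."""
    if not LONG_MIN <= v.value <= LONG_MAX:
        raise OverflowError("value does not fit in a 32-bit Long")
    return v.value


small = file_len(1_000_000)
large = file_len(3 * 1024**3)
print(small.vartype, to_long(small))  # I4 1000000
print(large.vartype)                  # I8
try:
    to_long(large)                    # caller kept a Long variable
except OverflowError as exc:
    print("conversion failed:", exc)
```

In this model, existing code that keeps `Long` variables behaves as before for small files and fails loudly on the conversion for large ones, while code that switches its variables to `LongLong` gets the full value without any function name changing.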

WaynePhillipsEA commented 2 years ago

> Does TB support native LongLong variables for the x86 target (not wrapped in a Variant)?

Yes, full native support on both architectures.

> I was wondering if it's possible for the original FileLen / LOF / etc. functions/statements to be repurposed to return a Variant, which would be I4 for positions below 2GB and I8 for larger offsets.

> Coupled with implicit LongLong to Long conversion failing, as in x64 VBA, this seems like a reasonable compromise.

> I mean this way one has two choices: work with LongLong variables for offsets if you want transparent large-file support from the runtime, or keep it as currently implemented with Long offsets if you want to risk it failing on the I8 to I4 conversion when file sizes/offsets exceed the 2GB limit.

Yes, that might work! It would certainly save us from polluting the global namespace with extra library functions. The performance cost to wrap in a variant should be negligible.

Kr00l commented 2 years ago

I don't like the Variant solution because it makes code snippets not "safe", as they are dependent on project settings etc. A code snippet with FileLen64 etc. would then clearly fail in VBA, as intended.

wqweto commented 2 years ago

> I don't like the Variant solution because it makes code snippets not "safe", as they are dependent on project settings etc.

Long is 32-bit in both x86 and x64 targets, so snippets using Long for file sizes/offsets will fail no matter the bitness of the target.

> A code snippet with FileLen64 etc. would then clearly fail in VBA, as intended.

True, but a code snippet with FileLen will always fail on large files, no matter the target bitness or the local variables used. I mean, we already have VBx samples that always fail on large files, so the situation cannot become worse.

Using Variants on retvals/params would keep the signature of functions/statements the same and would allow the runtime to cop out from runtime errors with large files by transferring the responsibility for correctly dimensioning local variables for file sizes/offsets to client code.

This way sample snippets can be easily corrected by changing the data types of local variables, rather than having to change both the data types and the function/statement names by appending a 64.

Kr00l commented 2 years ago

> > I don't like the Variant solution because it makes code snippets not "safe", as they are dependent on project settings etc.

> Long is 32-bit in both x86 and x64 targets, so snippets using Long for file sizes/offsets will fail no matter the bitness of the target.

> > A code snippet with FileLen64 etc. would then clearly fail in VBA, as intended.

> True, but a code snippet with FileLen will always fail on large files, no matter the target bitness or the local variables used. I mean, we already have VBx samples that always fail on large files, so the situation cannot become worse.

> Using Variants on retvals/params would keep the signature of functions/statements the same and would allow the runtime to cop out from runtime errors with large files by transferring the responsibility for correctly dimensioning local variables for file sizes/offsets to client code.

> This way sample snippets can be easily corrected by changing the data types of local variables, rather than having to change both the data types and the function/statement names by appending a 64.

I understand. It's just a "different failure" for unaware apps when the size is too large. That could work.

mwolfe02 commented 2 years ago

> The performance cost to wrap in a variant should be negligible.

If you wanted to give developers the option to avoid even that negligible cost, perhaps you could offer strongly typed alternatives using the data-type suffix characters for Long and LongLong similar to how the string functions in the standard library offer variant (Left()) and strongly-typed (Left$()) alternatives: