SvarDOS / edrdos

Enhanced DR-DOS kernel and command interpreter ported to JWasm and OpenWatcom C
http://svardos.org/

Single-file kernel load from ecm repo? #28

Open ecm-pushbx opened 7 months ago

ecm-pushbx commented 7 months ago

Hi, I noticed your recent commits don't include the patches to use single-file load via lDOS iniload / drload + drkernpl. In case you missed it, I wrote about this at https://pushbx.org/ecm/dokuwiki/blog/pushbx/2024/0107_enhanced_dr-dos_single-file_load

Did you intentionally not include these patches? Do you have additional questions or is there something else I can do to help you?

boeckmann commented 7 months ago

Hi! I intentionally delayed the inclusion of the single-file patches, because

  1. I wanted to release at least one more-or-less working dual-file package for SvarDOS before importing it, and
  2. wanted to play around with lDebug boot capabilities before importing it (I am currently studying the extensive lDebug manual :-) )

When I import the changes for single-file load, do I have to provide a new SYS command with the SvarDOS package? I also noticed you opened a ticket regarding the boot protocol at https://github.com/FDOS/kernel/issues/119. Next, I am not sure if I should LZMA compress the kernel for SvarDOS, because you mentioned some 30 second decompression time. But I have to test this and weigh the options against each other. If I use it uncompressed, I should at least test if this zero-compression would decrease the file size, so implementing an open source version of compbios/compbdos, probably as a single tool, would be high on the priority list. All in all I have to invest some time to understand how this all fits together.

I am also somewhat distracted by doing some work on command.com, fixing annoyances as I encounter them. But I think I am nearly through with this.

ecm-pushbx commented 7 months ago

Hi! I intentionally delayed the inclusion of the single-file patches, because

  1. I wanted to release at least one more-or-less working dual-file package for SvarDOS before importing it, and

Fair.

  2. wanted to play around with lDebug boot capabilities before importing it (I am currently studying the extensive lDebug manual :-) )

If you come across any questions not answered in the manual feel free to drop me a note ^^ (Here or in a mail or on the forum or the blog.)

When I import the changes for single-file load, do I have to provide a new SYS command with the SvarDOS package?

All EDR-DOS / FreeDOS SYS programs should work if you have the user pass /bootonly /k edrdos.com or one of the other filenames. It's your call if you want to build and host a new revision of FreeDOS SYS with a load protocol that defaults to one of the new names. (The current FreeDOS kernel sources contain most of the EDR-DOS patches in #ifdef conditionals I believe.)

I also noticed you opened a ticket regarding the boot protocol at FDOS/kernel#119.

Yes, but as mentioned this does not affect the drload or iniload stages as used by my EDR-DOS and lDebug. I will likely submit a patch to FreeDOS to just have the FAT32 loaders always pass both registers at a later time. This would also make it so you could compile EDR-DOS SYS from the shared sources with the existing #ifdef blocks without needing to patch the boot loaders to match the original EDR SYS's, yet work just like that to pass DL too.

Next, I am not sure if I should LZMA compress the kernel for SvarDOS, because you mentioned some 30 second decompression time. But I have to test this and weigh the options against each other.

In that blog post I actually mentioned it can take several minutes, here:

However, depacking on a low-end machine (eg NEC V20) may take several minutes to complete.

This is based on the performance of lDebug release 5 tested on a 1 MiB model of the HP 95LX, powered by a "5.37 MHz NEC V20". (I don't think you can boot another DOS easily on that particular device, but it does provide some reference for low-end machines.) In the blog post comparing the depack speed I found that the LZMA-302eos depacker needed more than two minutes. This is with a depacked kernel stage of size 106 KiB, compressed to an image of 62 KiB for LZMA-302eos (not including iniload or depacker), and an image of 69 KiB for LZSA2 which is also mentioned in the performance comparison.

(If you want to change the example of my EDR-DOS mak.sh script to use LZSA2 compression instead, you would have to pass -D_LZSA2 to inicomp.asm instead and use the lzsa command line tool to compress, which is found in this repo. The lDebug mak.sh script is the canonical source of truth about inicomp use.)

I did improve things for the LZMA-302eos depacker a bit by enabling its _COUNTER define, that is, having it display a progress indicator (by default a dot is displayed at every 128th step). About four lines of 80 columns each are filled by the 2024 January revision when the depacker runs to completion.

If I use it uncompressed, I should at least test if this zero-compression would decrease the file size, so implementing an open source version of compbios/compbdos, probably as a single tool, would be high on the priority list.

It likely would decrease the size. However, you would have to build the (BIO module) kernel twice if you want to use it with and without the compression to build different versions of the kernel file.

All in all I have to invest some time to understand how this all fits together.

Fair enough, just wanted to check in.

I am also somewhat distracted by doing some work on command.com, fixing annoyances as I encounter them. But I think I am nearly through with this.

Something I would like to see in the shell is SET /E variable=command support modelled on FreeCOM's. Basically it internally runs the command, redirecting its stdout like command > file would, then grabbing the first line from the temporary file to use as the content of the variable. I would probably be able to do it on my own but I would also welcome it if you worked on this.
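To make the described mechanism concrete, here is a minimal Python sketch of the idea: run a command with its stdout redirected into a temporary file, then use the first line of that file as the variable's value. The function name `set_from_command` and the dictionary standing in for the environment are illustrative assumptions; this is not how FreeCOM or the EDR-DOS shell implements it.

```python
import os
import subprocess
import tempfile


def set_from_command(env, name, command):
    """Store the first line of a command's stdout in env[name] (hypothetical helper)."""
    # Redirect stdout into a temporary file, like `command > file` would.
    fd, path = tempfile.mkstemp(text=True)
    try:
        with os.fdopen(fd, "w") as out:
            subprocess.run(command, shell=True, stdout=out, check=False)
        # Grab only the first line of the captured output.
        with open(path, "r") as captured:
            first_line = captured.readline().rstrip("\r\n")
    finally:
        os.remove(path)
    env[name] = first_line
    return first_line


env = {}
set_from_command(env, "GREETING", "echo hello")
print(env["GREETING"])
```

Only the first line is kept; any further output of the command is discarded together with the temporary file.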

ecm-pushbx commented 7 months ago

SET /E issue created in response to my comment: https://github.com/SvarDOS/edrdos/issues/30

boeckmann commented 6 months ago

Because this compression topic came up at the SvarDOS forum, I tested heatshrink compression of DRBIO.SYS and DRDOS.SYS and got the following results (parameter -w 8).

DRBIO.SYS: 22K (original 36K, partially zero-compressed)
DRDOS.SYS: 33K (original 37K, mostly zero-compressed)

I then tested the exoraw [1] compression and got the following results:

DRBIO.SYS: 18K
DRDOS.SYS: 27K

Quite a difference from heatshrink and a perfectly acceptable size in my opinion. But I still have to test the decompression speed...

[1] https://bitbucket.org/magli143/exomizer/src/master/

ecm-pushbx commented 6 months ago

Can you show the exact commands and files that you used? Do your sizes include depackers for these formats?

boeckmann commented 6 months ago

The exoraw command was (DRBIO.SYS):

exoraw DRBIO.SYS -o DRBIO.TST

Heatshrink:

heatshrink -w 8 DRBIO.SYS DRBIO.TST

The depacker size is not included. This was a first test to see how compression generally performs.

boeckmann commented 6 months ago

What is interesting is that exoraw performs much better on actual X86 code. While heatshrink does a good job of getting rid of the remaining zero-filled areas in DRBIO.SYS, compression of DRDOS.SYS is quite "bad".

boeckmann commented 6 months ago

The exomizer version I use is from commit https://bitbucket.org/magli143/exomizer/commits/df77c879ce2addc027043e5e80e3992a2ec99eb9. Version 3.1.1.

ecm-pushbx commented 6 months ago

I do have support for both exomizer and heatshrink in inicomp already. Also apl/apultra and of course the lzip I already used in my EDR-DOS kernel build. I'm using exomizer-3.0.2.zip, also from that bitbucket repo. Here are some numbers gathered from building the current tip of lDebug, the non-debuggable non-DPMI triple-mode executable ldebug.com: (The sizes are for the entire executable, including lDOS boot iniload and the inicomp depacker stage.)

ldebug/source$ INICOMP_SPEED_TEST=128 use_build_decomp_test=1 INICOMP_METHOD="lzd exodecr apl heatshrink lzsa2" ./mak.sh
[...]
ldebug/source$ LC_ALL=C sort ../tmp/debug.siz
   79872 bytes ( 60.70%), method              lzd
   83968 bytes ( 63.81%), method              apl
   84480 bytes ( 64.20%), method          exodecr
   86528 bytes ( 65.75%), method            lzsa2
   99840 bytes ( 75.87%), method       heatshrink
  131584 bytes (100.00%), method             none
ldebug/source$ LC_ALL=C sort ../tmp/debug.spd
    1.46s for 128 runs (   11ms / run), method            lzsa2
    3.00s for 128 runs (   23ms / run), method              apl
    3.84s for 128 runs (   30ms / run), method          exodecr
    5.00s for 128 runs (   39ms / run), method       heatshrink
    9.56s for 128 runs (   74ms / run), method              lzd
ldebug/source$ hg id
78e49c57a05f tip

The speed test uses dosemu2 with KVM on an amd64 Debian system. Here's the depack stage size displayed for each of those:

../../inicomp/lzd.asm:698: warning: localvariables has 14688 bytes [-w+user]
../../inicomp/inicomp.asm:1323: warning: inilz: 3024 bytes used for depacker [-w+user]
../../inicomp/inicomp.asm:1323: warning: iniexo: 1472 bytes used for depacker [-w+user]
../../inicomp/inicomp.asm:1323: warning: iniapl: 1344 bytes used for depacker [-w+user]
../../inicomp/inicomp.asm:1323: warning: inihs: 1360 bytes used for depacker [-w+user]
../../inicomp/inicomp.asm:1323: warning: inilzsa2: 1424 bytes used for depacker [-w+user]

And here are the compression commands used for all of the supported formats: https://hg.pushbx.org/ecm/ldebug/file/78e49c57a05f/source/mak.sh#l530

ecm-pushbx commented 6 months ago

By the way, the best heatshrink option turns out to be -w 14 -l 4 for this lDebug build. The mak.sh script tests all valid combinations of the parameters.

boeckmann commented 6 months ago

By the way, the best heatshrink option turns out to be -w 14 -l 4 for this lDebug build. The mak.sh script tests all valid combinations of the parameters.

I made a test, comparing the sizes for DRDOS.SYS: 34279 bytes for -w 8 and 32930 bytes for -w 14 -l 4. So an improvement of ~1,300 bytes.

The exomizer changelog states that most depackers can not handle the new bitstream format for 3.0+.

So I also made a test calling exoraw -P 0 to make the bitstream compatible with older depackers. DRDOS.SYS file size: 28362 bytes. Up from 28087 bytes when using version 3.1 features. Not much worse than before. -P7 yields also a size of 28362 bytes. Indicating that control bit 5 has some effect. My exoraw uses -P 39 by default.

For the inicomp depacker I have to restrict to -P7? I see no tests like -P & 32 in https://hg.pushbx.org/ecm/inicomp/file/tip/exodecr.asm.

ecm-pushbx commented 6 months ago

The exomizer changelog states that most depackers can not handle the new bitstream format for 3.0+.

So I also made a test calling exoraw -P 0 to make the bitstream compatible with older depackers. DRDOS.SYS file size: 28362 bytes. Up from 28087 bytes when using version 3.1 features. Not much worse than before. -P7 yields also a size of 28362 bytes. Indicating that control bit 5 has some effect. My exoraw uses -P 39 by default.

For the inicomp depacker I have to restrict to -P7? I see no tests like -P & 8 in https://hg.pushbx.org/ecm/inicomp/file/tip/exodecr.asm.

Better link. Look again, I see conditionals for _P & x with x equal to 1, 2, 4, 8, and 16. So the P flag 32 seems to be new, I'd have to study the changes to find out whether the depacker needs an update. (Don't forget to pass -D_P= to the assembly command if you want to use a P other than 7.)

boeckmann commented 6 months ago

Yes, 32 corresponds to bit 5. No idea why I wrote 8 :-/
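The relationship between the -P values mentioned here (39, 7, 32) is plain bit arithmetic. The sketch below only decomposes the numbers into their set bits; it makes no claim about what each Exomizer flag bit means, and `flag_bits` is a helper name invented for this example.

```python
def flag_bits(value):
    """Return the set bits of value as a list of powers of two."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


default_p = 39  # exoraw default reported in this thread
legacy_p = 7    # value the inicomp depacker handled at this point

print(flag_bits(default_p))              # [1, 2, 4, 32]
print(flag_bits(legacy_p))               # [1, 2, 4]
print(flag_bits(default_p & ~legacy_p))  # [32], i.e. bit 5
```

So the only difference between -P 39 and -P 7 is the flag with value 32, which matches the observation that -P 0 and -P 7 produce the same file size while the default does not.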

mateuszviste commented 6 months ago

While these heatshrink / exoraw options are very interesting, could you tell what is the reason that the DR kernel cannot be UPXed like how the FreeDOS folks do? UPX is quite efficient when it comes to compressing x86 code. I assume there is an obvious technical limitation but I'm just too dumb to understand it on my own, so please consider this to be a "please educate me" question :)

ecm-pushbx commented 6 months ago

I have worked a little with the FD kernel UPX support, for example creating an offline depacker and changing the format and placement of the CONFIG block and stubs.

However, I prefer working with my own depackers and inicomp stage, which offer more choices than UPX and also work with triple-mode executables (kernel, DOS device driver, DOS application). (Executables larger than 64 KiB are supported with the new DOS/EXE exeflat format from that PR of mine that was merged.) The usage conditions and licenses are clearer as well. (I have a reminder somewhere to complain on the freedos-devel list that the DOS UPX releases mirrored by FD use the nonfree NRV compression library. Not a big priority but not nice either.) (The usage conditions are a reason for my offline depacker as UPX states you should be able to unpack the executable, and not to modify their depackers.)

ecm-pushbx commented 6 months ago

Reviewing the new exeflat format, it also seems to depend on the kernel being loaded at segment 60h which is another hurdle. Probably not that difficult to change but I'd have to study it more.

mateuszviste commented 6 months ago

So mostly license-related reasons, thanks for clarifying. I was hoping for a quick win to start proposing an EDR-based SvarDOS "preview" release but I understand that whether it is UPX or a custom packer, in both cases some serious work must happen first, there's no free lunch option. I'm keeping my fingers crossed then :)

ecm-pushbx commented 6 months ago

but I understand that whether it is UPX or a custom packer, in both cases some serious work must happen first,

Well, for lDOS inicomp + lDOS drload/iniload very little work is needed, just pick my single-file kernel patches and (if other format than lzip desired) adapt the compression commands from lDebug's main mak.sh.

boeckmann commented 6 months ago

As a sidenote: I wanted to test how well exomizer compresses the FDISK executable in comparison to UPX, and here are the results:

exoraw: 38321 bytes, 71994 uncompressed
UPX: 39247 bytes, but this includes the decompression stub

so all-in-all on par, I would say. Within ~500 bytes difference considering the decompressor size to be added for exoraw.

ecm-pushbx commented 6 months ago

I added support for Exomizer's new -P 32 flag in https://hg.pushbx.org/ecm/inicomp/rev/98918a411ab5

I also added some auto-detection to pass the correct -D_P= switch to NASM, instead of defaulting to 7 when no INICOMP_EXOMIZER_P variable is given: https://hg.pushbx.org/ecm/ldebug/rev/bcaea57de05c

Doubling the -P switch is needed to work around a bug in older Exomizer versions: https://hg.pushbx.org/ecm/ldebug/rev/3786ef6709ef

boeckmann commented 5 months ago

Yesterday I had a look at the exomizer source code. While the decoding algorithm is fairly straightforward, I am struggling to reverse engineer how the compressor actually works. I have not found an algorithmic description, and the source is basically without comments. That's a little sad, because I really would like to know how this thing works.

Apart from that, digging a little deeper into this kernel-compression topic, I am starting to wonder if an alternative algorithm like LZSA2 would be a better fit for the single-file compressed version of the EDR kernel, because according to https://github.com/emmanuel-marty/lzsa, decompression should be significantly faster. And ECM's measurements above indicate a decompression speedup of roughly 2.6 times for LZSA2 in comparison to exomizer (3.84 s versus 1.46 s for 128 runs, though not measured on an 8088), bought with a moderate increase in file size.
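The trade-off can be read directly off the lDebug build figures posted earlier in this thread. The following sketch just redoes that arithmetic; the numbers are copied from the `debug.siz` and `debug.spd` listings, and the `compare` helper is a name made up for this illustration.

```python
# Figures copied from the lDebug build measurements quoted earlier in this thread.
sizes = {"lzd": 79872, "apl": 83968, "exodecr": 84480,
         "lzsa2": 86528, "heatshrink": 99840}
seconds_per_128_runs = {"lzsa2": 1.46, "apl": 3.00, "exodecr": 3.84,
                        "heatshrink": 5.00, "lzd": 9.56}


def compare(fast, slow):
    """Return (speedup factor of fast over slow, extra bytes fast costs)."""
    speedup = seconds_per_128_runs[slow] / seconds_per_128_runs[fast]
    extra_bytes = sizes[fast] - sizes[slow]
    return speedup, extra_bytes


speedup, extra = compare("lzsa2", "exodecr")
print(f"LZSA2 vs exomizer: {speedup:.2f}x faster, {extra} bytes larger")
```

These timings come from dosemu2 with KVM on a modern machine, so the ratio on an 8088-class CPU may well differ.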

ecm-pushbx commented 5 months ago

https://pushbx.org/ecm/dokuwiki/blog/pushbx/2024/0319_early_mid_march_work

Further, I have considered a new repo which will be based on the mak script of lDebug and will provide the scripting needed to wrap and optionally compress the Enhanced DR-DOS single-file kernel or the FreeDOS kernel. This will also eventually lead to CONFIG block support in inicomp and fdkernpl (part of ldosboot).

I didn't get around to this yet. The lDebug mak script allows building several compression formats in the same run, so users could choose either apl, Exomizer, LZMA-lzip, or faster formats like LZSA2.

ecm-pushbx commented 3 months ago

Current EDR-DOS builds from me use the new kernwrap script which is derived from lDebug's mak script. So in the tmp/ subdirectories users may choose alternatives of the compressed edrpack.sys (drload) or edrpack.com (iniload) files, including the very fast LZSA2, or LZ4, or heatshrink, or the also strong apl or exomizer.

ecm-pushbx commented 3 months ago

Yesterday I had a look at the exomizer source code. While the decoding algorithm is fairly straightforward, I am struggling to reverse engineer how the compressor actually works. I have not found an algorithmic description, and the source is basically without comments. That's a little sad, because I really would like to know how this thing works.

I don't know much about any of the compression methods I use. They all either came with depacker code usable enough for me to port, or with detailed descriptions of the resulting format so that I could implement a depacker on my own.

This difference is what determines my licensing as stated in my lDebug manual. So the depackers for LZ4, Snappy, Heatshrink, and LZO were written by me. The remaining ones I ported, that is BriefLZ, Exomizer, X, Lzd (lzip), LZSA2, apl, bzpack.

I haven't had any good ideas for compressing data on my own. I did experiment some with "bit runs" which was basically a way to record how many consecutive bits of the same value occur. This turned out to be a great way to expand the size of a file.

The other idea I've had is to use genetic programming to obtain an algorithm/program that reconstructs the original data. Also no useful results so far.

At one point I tried out a neural net based compression method someone else wrote (unfortunately don't recall the name) but this had high memory consumption and was very slow. And it was symmetric in time and space, ie the depacker was as slow as the packer.

boeckmann commented 3 months ago

@ecm-pushbx I will try to create a new EDR package for SvarDOS in the next days, using your current single-file kernel. If I understand correctly, the edrdos.sys and edrpack.sys would work with the FreeDOS SYS provided loader in DRDOS boot protocol mode? Would it be much effort for you to provide me with an LZSA2 version of the single file? Otherwise I would have to set up a Linux VM with an appropriate build environment (DOSEMU etc.).

ecm-pushbx commented 3 months ago

@ecm-pushbx I will try to create a new EDR package for SvarDOS in the next days, using your current single-file kernel. If I understand correctly, the edrdos.sys and edrpack.sys would work with the FreeDOS SYS provided loader in DRDOS boot protocol mode?

All files (edrdos/edrpack).(sys/com) work with FreeDOS loaders as replacement for kernel.sys or the original EDR-DOS loaders (based on FreeDOS's) as replacement for drbio.sys. The .com files also work as MS-DOS, IBM-DOS, Multiboot, lDOS / RxDOS kernels. To your specific question, yes, both of the .sys files will work with EDR SYS's default loader. (But SYS may expect the drdos.sys file to exist or be copied over as well. This is not needed for us.)

(The exception is that if you append arbitrary data such as a .zip file to the kernel then it will at some point (eg > 128 KiB) fail to work as FreeDOS/EDR-DOS load. So you shouldn't ever do that to the .sys files as they will be useless then. The .com files can still be loaded in one of the other load protocols even with data > 1 MiB.)

Would it be much effort for you to provide me with an LZSA2 version of the single file? Otherwise I would have to set up a Linux VM with an appropriate build environment (DOSEMU etc.).

It is already shipped in https://pushbx.org/ecm/download/edrdos.zip in the subdirectory tmp/sa2/ which has an edrpack.sys and edrpack.com

ecm-pushbx commented 3 months ago

Oops, forgot to submit the prior reply until now.

boeckmann commented 3 months ago

Cool, thanks. Actually I already downloaded the archive but did not expect the LZSA2 kernel in the tmp folder :)

boeckmann commented 3 months ago

I added a /OEM:EDRPACK option to FreeDOS SYS, which makes the loader search for EDRPACK.SYS. @ecm-pushbx is EDRPACK.SYS the final name for it? Kernel loads fine. I might create a pull request at FreeDOS for the addition.

I will remove the DRSYS.COM file from the SvarDOS kernel package, and instead would replace the SvarDOS SYS command with this one. Kernel then has to be installed via SYS /UPDATE /OEM:EDRPACK or SYS /UPDATE /OEM:EDR (for the dual-file version), with the default installing the FreeDOS kernel. But this would be ok I think. Should not overburden the ordinary user able to install SvarDOS :)

mateuszviste commented 3 months ago

This sounds awesome, thank you Bernd & ECM!

About the "SYS /UPDATE /OEM:xxxx" business - wouldn't it be possible for SYS to figure out which kernel it installs? For example:

If C:\KERNEL.SYS is present -> FreeDOS mode
If C:\EDRPACK.SYS is present -> EDR-DOS packed
If C:\DRBIO.SYS etc

Of course it won't work if someone has different variants present in C:\, so an /OEM way to enforce a specific kernel would still be required, but such automatic detection would at least cover the most obvious scenario of a single-kernel system.
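The detection rule proposed here, including the ambiguous multi-kernel case, could look roughly like the sketch below. It is written in Python purely for illustration (SYS itself is not written in Python), and the mode labels and the `detect_kernel_mode` name are hypothetical. The DRBIO.SYS entry is an assumption based on the dual-file kernel mentioned elsewhere in this thread.

```python
# Hypothetical mapping from kernel file name to the mode it would imply.
KERNEL_MODES = {
    "KERNEL.SYS": "FreeDOS",   # FreeDOS kernel
    "EDRPACK.SYS": "EDRPACK",  # packed single-file EDR-DOS kernel
    "DRBIO.SYS": "EDR",        # dual-file EDR-DOS kernel (assumed)
}


def detect_kernel_mode(root_files):
    """Return the implied mode, or None if zero or several kernels are present."""
    present = {name.upper() for name in root_files}
    matches = [mode for name, mode in KERNEL_MODES.items() if name in present]
    # Only auto-detect in the unambiguous single-kernel case.
    return matches[0] if len(matches) == 1 else None


print(detect_kernel_mode(["kernel.sys", "command.com"]))
print(detect_kernel_mode(["KERNEL.SYS", "EDRPACK.SYS"]))
```

Returning `None` for the ambiguous case leaves room for an explicit /OEM switch to make the decision.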

boeckmann commented 3 months ago

About the "SYS /UPDATE /OEM:xxxx" business - wouldn't it be possible for SYS to figure out which kernel it installs? For example:

If C:\KERNEL.SYS is present -> FreeDOS mode
If C:\EDRPACK.SYS is present -> EDR-DOS packed
If C:\DRBIO.SYS etc

Of course it won't work if someone has different variants present in C:\, so an /OEM way to enforce a specific kernel would still be required, but such automatic detection would at least cover the most obvious scenario of a single-kernel system.

Well this is another option that is supported "out-of-the-box". If you SYS from the directory that only contains the EDR kernel (the package directory) it will install it without the user being forced to use /OEM. Otherwise the search order is:

boeckmann commented 3 months ago

If there are multiple kernels in the root folder one can switch the kernel with:

SYS C: /BOOTONLY /OEM:EDRPACK

et al. This only installs a new volume boot record with the kernel / boot protocol information adjusted.

ecm-pushbx commented 3 months ago

I added a /OEM:EDRPACK option to FreeDOS SYS, which makes the loader search for EDRPACK.SYS. @ecm-pushbx is EDRPACK.SYS the final name for it? Kernel loads fine. I might create a pull request at FreeDOS for the addition.

I will remove the DRSYS.COM file from the SvarDOS kernel package, and instead would replace the SvarDOS SYS command with this one. Kernel then has to be installed via SYS /UPDATE /OEM:EDRPACK or SYS /UPDATE /OEM:EDR (for the dual-file version), with the default installing the FreeDOS kernel. But this would be ok I think. Should not overburden the ordinary user able to install SvarDOS :)

Actually I would like it if loading / using edrdos.com was also supported. But /OEM:EDR is taken and adding a distinct /OEM:EDRDOS would be confusing. Therefore I would suggest /OEM:LEDR for loading lDOS iniload edrdos.com and probably /OEM:LEDRPACK for loading lDOS drload edrpack.sys. This could be symmetrical to a to-be-added /OEM:LMS and /OEM:LMSPACK (as my MS-DOS v4 fork files are called lmsdos.com and lmspack.sys).

Further, I would prefer to use either segment 60h (like FreeDOS load) or segment 200h by default for LEDR, LEDRPACK, LMS, or LMSPACK. This avoids possible problems with DMA that has to cross a 64 KiB boundary. (The FreeDOS original FAT12 and FAT16 loaders do their sector reads into a temporary buffer, but lDOS boot's loaders in FreeDOS compatible modes do not. Segment 60h implies that a 512-byte sector read is never crossing a 64 KiB boundary, while with segment 70h it will if the file is about 64 KiB in size. Segment 200h similarly avoids the problem.)

boeckmann commented 1 month ago

My current plan for the single-file kernel is as follows: I changed DRBIO so that the zero-uncompression code is placed in the deblocking buffer near the start of DRBIO instead of in the init code far into the file. This makes it possible to compress nearly the whole of DRBIO, and brought the DRBIO size down from ~36K to ~24K.

Next I want to produce a combined kernel file, with DRBIO and DRSYS combined and zero compressed. Sadly, this brings us only down to ~60K, but it is a start.

Once I get this to work, we can replace the zero compression with any compression algorithm whose decompressor does not require more than 512 bytes, so that it still fits into the deblocking buffer.

The benefit of the whole procedure is that this would not depend on external components like a separate uncompressor stage prepended to the kernel.

There seem to be plenty of ready-to-go uncompressor implementations that fit into the 512 byte target. Like @ecm-pushbx pointed out, LZMA is way too slow for ancient machines. We should choose another one. For simplicity, I opt for sticking with a single algorithm, and leaving the uncompressor code in place for both the compressed and the uncompressed kernel (the deblocking space is there anyway). If the user wants to experiment with different compressors he/she can still use the ecm implementation.

ecm-pushbx commented 1 month ago

Next I want to produce a combined kernel file, with DRBIO and DRSYS combined and zero compressed

Think you meant DRDOS here.

boeckmann commented 1 month ago

Think you meant DRDOS here.

Yes.

ecm-pushbx commented 1 month ago

Like @ecm-pushbx pointed out, LZMA is way too slow for ancient machines.

Relevant reference: https://github.com/SvarDOS/edrdos/issues/69#issuecomment-2244834392

ecm-pushbx commented 1 month ago

Presumably you want to retain loading only with the EDR-DOS load protocol? As that implies the entire file is loaded so you don't need an initial loader or lDOS iniload stage (akin to Microsoft MSLOAD). Will you also support loading at better segments than 70h like 60h (FreeDOS load) and 200h?

boeckmann commented 1 month ago

Presumably you want to retain loading only with the EDR-DOS load protocol? As that implies the entire file is loaded so you don't need an initial loader or lDOS iniload stage (akin to Microsoft MSLOAD).

Yes, that is the plan. Or switching to the FreeDOS load protocol.

Will you also support loading at better segments than 70h like 60h (FreeDOS load) and 200h?

I am not yet sure about the consequences of loading it to 60h. But I think it would be desirable to eventually utilize the same load protocol as the FreeDOS kernel, so switching the kernel would merely mean replacing the kernel binaries.

mateuszviste commented 1 month ago

I am not yet sure about the consequences of loading it to 60h. But I think it would be desirable to eventually utilize the same load protocol as the FreeDOS kernel, so switching the kernel would merely mean replacing the kernel binaries.

That would indeed be very good, ie. not having to do a SYS after replacing kernel.sys by edrpack.sys (or maybe it could even be called kernel.sys as well?) It would definitely make it simple for SvarDOS users to switch between kernels.

BTW may I ask why 60h is a better segment than 70h or 200h? Is it a matter of limiting memory fragmentation?

ecm-pushbx commented 1 month ago

BTW may I ask why 60h is a better segment than 70h or 200h? Is it a matter of limiting memory fragmentation?

60h and 200h are both better than 70h. That's because they are aligned in memory at the normal sector size (512 Bytes) or for 200h even up to the maximum sector size that a DOS can generally handle (8 KiB). This is desirable because ISA DMA may be unable to read a sector so that it crosses a 64 KiB boundary in memory (ie the boundary at 10000h, 20000h, etc).

Quoth me:

Further, I would prefer to use either segment 60h (like FreeDOS load) or segment 200h by default for LEDR, LEDRPACK, LMS, or LMSPACK. This avoids possible problems with DMA that has to cross a 64 KiB boundary. (The FreeDOS original FAT12 and FAT16 loaders do their sector reads into a temporary buffer, but lDOS boot's loaders in FreeDOS compatible modes do not. Segment 60h implies that a 512-byte sector read is never crossing a 64 KiB boundary, while with segment 70h it will if the file is about 64 KiB in size. Segment 200h similarly avoids the problem.)

00600h / segment 60h / entrypoint 60h:0 is also the standard for FreeDOS's kernel.sys, albeit FreeDOS's SYS provides the /L switch to modify the default loader to use a different segment (still with entrypoint at +0:0 where the "plus" means displacement from the load address segment).

200h is a valid choice for SYS's /L switch as well. Other than that, it matches the new default for the lDOS load protocol but that actually loads to 02000h / segment 200h / entrypoint 200h:400h so its entrypoint doesn't match the FreeDOS style one. (And the lDOS entrypoint has some more requirements.)
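The alignment argument can be checked with a few lines of arithmetic. The sketch below assumes a loader that reads one 512-byte sector at a time directly to consecutive addresses starting at segment:0, and reports the first sector whose buffer would straddle a 64 KiB boundary. The function name and the 128 KiB example file size are choices made for this illustration.

```python
SECTOR = 512
BOUNDARY = 0x10000  # 64 KiB


def first_crossing_sector(segment, file_size, sector_size=SECTOR):
    """Zero-based index of the first sector read that straddles a 64 KiB boundary, or None."""
    base = segment * 16                      # real-mode segment to linear address
    sectors = -(-file_size // sector_size)   # round up to whole sectors
    for index in range(sectors):
        start = base + index * sector_size
        end = start + sector_size - 1
        if start // BOUNDARY != end // BOUNDARY:
            return index
    return None


for seg in (0x60, 0x70, 0x200):
    print(hex(seg), first_crossing_sector(seg, 128 * 1024))
# 0x60 None
# 0x70 124
# 0x200 None
```

With segment 70h the sector at index 124 starts at 0FF00h and ends at 100FFh, while 60h and 200h keep every 512-byte read inside one 64 KiB page.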

boeckmann commented 1 month ago

Ok, I have three of four cases working: uncompressed and compressed dual-file, and uncompressed single-file.

The zero-compressed single-file is causing some trouble at the moment. In the next days I will verify if the compressed single-file is valid by uncompressing it and comparing with the uncompressed single-file. If the files match it has to be some problem with the kernel uncompressor.

Though it seems the kernel decompresses at least parts of the single-file kernel correctly. If I change offset 5 of the binary from 0x81 to 0x01 (kernel flag changed from compressed single-file to compressed dual-file), I get a working DOS prompt. It then reads DRDOS.SYS from disk despite the kernel being single-file because the flag instructs it to do so...
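The experiment described above amounts to patching a single byte. A small sketch of such a patch follows; only the offset (5) and the two flag values 0x81 and 0x01 come from this comment. The helper name and the sample header bytes are made up, and no other flag values are implied.

```python
FLAG_OFFSET = 5
# Only the two values mentioned in this thread; other values are not known here.
KNOWN_FLAGS = {0x81: "compressed single-file", 0x01: "compressed dual-file"}


def patch_flag(image, new_flag):
    """Return (patched copy of image, previous flag byte)."""
    patched = bytearray(image)
    old = patched[FLAG_OFFSET]
    patched[FLAG_OFFSET] = new_flag
    return bytes(patched), old


# Made-up stand-in for the first bytes of a kernel image.
image = bytes([0xEB, 0x00, 0x00, 0x00, 0x00, 0x81, 0x00, 0x00])
patched, old = patch_flag(image, 0x01)
print(KNOWN_FLAGS[old], "->", KNOWN_FLAGS[patched[FLAG_OFFSET]])
```

Working on a copy keeps the original image intact, which is convenient when comparing the behaviour of both variants.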

mateuszviste commented 1 month ago

60h and 200h are both better than 70h. That's because they are aligned in memory at the normal sector size (512 Bytes) or for 200h even up to the maximum sector size that a DOS can generally handle (8 KiB). This is desirable because ISA DMA may be unable to read a sector so that it crosses a 64 KiB boundary in memory (ie the boundary at 10000h, 20000h, etc).

Thank you for your thorough explanations. I needed some time to process it. :-)

So I understand this is all about the boot loader being able to perform DMA copies when loading the kernel file from disk into memory. Such a load operation being carried out sector by sector, and the (logical, or "virtual") sector size being realistically always 512 bytes, loading the kernel file at seg 60h makes no 512-byte chunk cross the 64K boundary, while loading at seg 70h would make the 125th sector (index 124 counting from zero) land at address 0xFF00, thus crossing the 64K boundary, possibly failing due to legacy DMA limitations. All this does not matter of course as long as the kernel file (compressed, ie. as seen by the boot loader) is less than 63744 bytes in size. Did I get it right?

ecm-pushbx commented 1 month ago

60h and 200h are both better than 70h. That's because they are aligned in memory at the normal sector size (512 Bytes) or for 200h even up to the maximum sector size that a DOS can generally handle (8 KiB). This is desirable because ISA DMA may be unable to read a sector so that it crosses a 64 KiB boundary in memory (ie the boundary at 10000h, 20000h, etc).

Thank you for your thorough explanations. I needed some time to process it. :-)

So I understand this is all about the boot loader being able to perform DMA copies when loading the kernel file from disk into memory.

Yes. And ISA DMA particularly may be used to access diskette drives.

Such load operation being carried out sector by sector, and the (logical, or "virtual") sector size being realistically always 512 bytes, loading the kernel file at seg 60h makes no 512-byte chunk cross the 64K boundary, while loading at seg 70h would make the 124th sector land at address 0xFF00, thus crossing the 64K boundary, possibly failing due to legacy DMA limitations.

Yes, exactly.

All this does not matter of course as long as the kernel file (compressed, ie. as seen by the boot loader) is less than 63744 bytes in size. Did I get it right?

I didn't check the number but this is broadly correct. However, if you have a cluster size of say 8 KiB then the file size as experienced by the boot sector loader is always rounded up to a full cluster, so loading to 00700h avoids crossing the 10000h boundary only if the file is at most 56 KiB in size.
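
The arithmetic above can be checked with a small host-side sketch. It assumes that the loader reads one unit (a sector or a whole cluster) per transfer and that a transfer fails only when that unit straddles a 64 KiB physical boundary. The function name `first_crossing_unit` and its parameters are made up for illustration; they are not part of any boot loader discussed here.

```python
SECTOR = 512
BOUNDARY = 0x10000  # 64 KiB physical boundary that ISA DMA cannot cross


def first_crossing_unit(load_segment, file_size, unit_size=SECTOR):
    """Return the 0-based index of the first unit (sector or cluster) whose
    transfer would straddle a 64 KiB boundary, or None if no unit does.

    Assumption: the loader transfers exactly one unit per read and rounds
    the file size up to a whole number of units.
    """
    base = load_segment * 16              # real-mode segment -> linear address
    units = -(-file_size // unit_size)    # ceiling division
    for index in range(units):
        start = base + index * unit_size
        end = start + unit_size - 1
        if start // BOUNDARY != end // BOUNDARY:
            return index
    return None


# 512-byte sectors: segment 60h is sector-aligned, segment 70h is not.
print(first_crossing_unit(0x60, 128 * 1024))
print(first_crossing_unit(0x70, 128 * 1024))
# 8 KiB clusters loaded to segment 70h: a 56 KiB file fits, one byte more does not.
print(first_crossing_unit(0x70, 56 * 1024, 8192))
print(first_crossing_unit(0x70, 56 * 1024 + 1, 8192))
```

Under these assumptions the sector at 0-based index 124 is the one that starts at 0FF00h when loading to segment 70h, so a file of up to 124 sectors (63488 bytes) stays below the boundary.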

Other than that I always try to support large files; the FreeDOS load protocol is now good for files up to 128 KiB (with cluster size up to 128 KiB too) or for smaller clusters (eg 2 KiB) up to 134 KiB.

boeckmann commented 1 month ago

Zero-compressed single-file is working now too. The cause was a bug in the compressor, triggered by a non-zero area of more than 7fffh bytes. Fixed by 2ff2b50.
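
To illustrate why a long non-zero area can trip up such a compressor, here is a minimal sketch of a zero-run packer. The record layout is an assumption made for this example only and is not taken from the edrdos sources: each record starts with a little-endian 16-bit word, bit 15 marks a run of zero bytes, the low 15 bits hold the length, a literal record is followed by that many data bytes, and a zero word ends the stream. With a 15-bit length field, any run longer than 7fffh bytes has to be split into several records.

```python
import struct

MAX_COUNT = 0x7FFF   # largest length that fits the assumed 15-bit field
ZERO_FLAG = 0x8000   # assumed marker bit for "run of zero bytes"


def pack(data, terminate=True):
    """Pack bytes into zero-run and literal records (hypothetical layout)."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        is_zero = data[i] == 0
        j = i
        # Extend the run, but never past MAX_COUNT: longer runs are split.
        while j < n and (data[j] == 0) == is_zero and j - i < MAX_COUNT:
            j += 1
        if is_zero:
            out += struct.pack("<H", ZERO_FLAG | (j - i))
        else:
            out += struct.pack("<H", j - i)
            out += data[i:j]
        i = j
    if terminate:
        out += struct.pack("<H", 0)   # terminating zero word
    return bytes(out)


def unpack(packed):
    """Inverse of pack() for the same hypothetical layout."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(packed):
        (word,) = struct.unpack_from("<H", packed, pos)
        pos += 2
        if word == 0:
            break
        count = word & MAX_COUNT
        if word & ZERO_FLAG:
            out += bytes(count)
        else:
            out += packed[pos:pos + count]
            pos += count
    return bytes(out)


# A literal area longer than 7fffh bytes round-trips because it is split.
sample = b"\x01" * 0x9000 + bytes(100) + b"\x02"
print(unpack(pack(sample)) == sample)
```

The sketch encodes even a single zero byte as its own record, which a real packer would likely avoid; it is only meant to show the length limit and the optional terminator.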

boeckmann commented 1 month ago

This seems to be completed as of fe2c3de. I will open a few follow-up issues.

ecm-pushbx commented 1 month ago

I adapted the zerocomp packer into an unframed packer that compresses a single, whole file. (Must be compiled on a >16 bit system to support files >= 64 KiB.) At https://hg.pushbx.org/ecm/edrdos/rev/988a9dd9add1

I created an inicomp depacker stage for this file format: https://hg.pushbx.org/ecm/inicomp/rev/ef69f7eef438 This still uses most of the framing of the heatshrink depacker which I based this on. In particular, it will work for both compressed and uncompressed file sizes > 64 KiB, and displays a progress indicator, and checks for error conditions like a buffer overflow.

Here's a bunch of tests running all the interesting methods on the current ecm single-file kernel. As is visible at the start, this is built with the kernel-internal compression disabled for both modules. The grep shows how large each inicomp stage is; none goes below 1 KiB, even the simple zerocomp depacker. The .spd file displays performance comparisons (running on dosemu2 on our amd64 Debian server without usable KVM). The edrpackd.siz file lists the final file size of edrpack.sys (the drload variant, as opposed to iniload).

edrdos/repo$ cat ovr.bat
@echo off
call c:\autowat.bat
set compressdrbio=0
set compressdrdos=0
edrdos/repo$ grep -F '*****' tmp/*/pedrdos.lst
tmp/apl/pedrdos.lst:  1903          ******************       warning: iniapl: 1328 bytes used for depacker [-w+user]
tmp/bzp/pedrdos.lst:  1903          ******************       warning: inibzp: 1040 bytes used for depacker [-w+user]
tmp/exo/pedrdos.lst:  1903          ******************       warning: iniexo: 1424 bytes used for depacker [-w+user]
tmp/hs/pedrdos.lst:  1903          ******************       warning: inihs: 1344 bytes used for depacker [-w+user]
tmp/lz4/pedrdos.lst:  1903          ******************       warning: inilz4: 1520 bytes used for depacker [-w+user]
tmp/lz/pedrdos.lst:   698          ******************  <1>  warning: localvariables has 14688 bytes [-w+user]
tmp/lz/pedrdos.lst:  1903          ******************       warning: inilz: 3008 bytes used for depacker [-w+user]
tmp/sa2/pedrdos.lst:  1903          ******************       warning: inilzsa2: 1408 bytes used for depacker [-w+user]
tmp/zer/pedrdos.lst:  1903          ******************       warning: inizero: 1072 bytes used for depacker [-w+user]
edrdos/repo$ LC_ALL=C sort tmp/edrpack.spd
    1.07s for 128 runs (    8ms / run), method         zerocomp
    3.64s for 128 runs (   28ms / run), method            lzsa2
    5.80s for 128 runs (   45ms / run), method              lz4
    6.01s for 128 runs (   46ms / run), method              bzp
   18.44s for 128 runs (  144ms / run), method              apl
   20.11s for 128 runs (  157ms / run), method          exodecr
   30.09s for 128 runs (  235ms / run), method       heatshrink
   62.50s for 128 runs (  488ms / run), method              lzd
edrdos/repo$ LC_ALL=C sort tmp/edrpackd.siz
   46128 bytes ( 56.48%), method              lzd
   46688 bytes ( 57.17%), method          exodecr
   47248 bytes ( 57.85%), method              apl
   49040 bytes ( 60.05%), method            lzsa2
   54016 bytes ( 66.14%), method              bzp
   54800 bytes ( 67.10%), method              lz4
   57344 bytes ( 70.21%), method       heatshrink
   61760 bytes ( 75.62%), method         zerocomp
   81664 bytes (100.00%), method             none
edrdos/repo$

ecm-pushbx commented 1 month ago

Forgot to mention this is the command used to run the tests:

INICOMP_METHOD="lzd exodecr apl bzp heatshrink lzsa2 lz4 zerocomp" INICOMP_SPEED_TEST=128 ./mak.sh onlypl notracelist

boeckmann commented 1 month ago

I adapted the zerocomp packer into an unframed packer that compresses a single, whole file. (Must be compiled on a >16 bit system to support files >= 64 KiB.) At https://hg.pushbx.org/ecm/edrdos/rev/988a9dd9add1

The >64K thing is why the compkern does it in two runs, one for the BIO and one for the BDOS, because as a whole they are larger than 64K. Therefore I added a new parameter to the zerocomp function to suppress the terminating zero word.

ecm-pushbx commented 1 month ago

I adapted the zerocomp packer into an unframed packer that compresses a single, whole file. (Must be compiled on a >16 bit system to support files >= 64 KiB.) At https://hg.pushbx.org/ecm/edrdos/rev/988a9dd9add1

The >64K thing is why the compkern does it in two runs, one for the BIO and one for the BDOS, because as a whole they are larger than 64K. Therefore I added a new parameter to the zerocomp function to suppress the terminating zero word.

Like I commented I think zerocomp should process its data as a stream rather than in full blocks, ideally. But my scripting requires a non-DOS host regardless so it isn't a great priority for me to support large files in 86 Mode DOS.

/* To allow input size >= 64 KiB this tool must be compiled on a 32-bit
   or 64-bit host. Compilation for 86 Mode DOS is possible but may
   truncate the data if it is too large, as size_t is likely a 16-bit
   type and the near buffer in read_file may be limited to < 64 KiB.

   Running for large files on 16-bit DOS would require to change the
   packer to stream data rather than working with entire-file buffers. */
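
As a rough illustration of the streaming idea, the following sketch packs its input one bounded chunk at a time instead of buffering the whole file. It uses a made-up record layout (a little-endian 16-bit word whose bit 15 marks a zero run and whose low 15 bits hold the length, literal bytes following a literal record, a zero word as terminator); this layout, the name `pack_stream` and the chunk size are assumptions for the example, not the actual zerocomp format or API.

```python
import io
import struct

CHUNK = 0x4000       # bounded working buffer, well below 64 KiB
ZERO_FLAG = 0x8000   # assumed marker bit, not taken from the real packer


def pack_stream(src, dst, chunk_size=CHUNK):
    """Copy src to dst as zero-run/literal records, one chunk at a time.

    Runs are simply cut at chunk boundaries, which costs a few extra
    header words but keeps memory use independent of the file size.
    Returns the number of input bytes consumed.
    """
    if not 0 < chunk_size <= 0x7FFF:
        raise ValueError("chunk_size must fit the 15-bit length field")
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        i = 0
        while i < len(chunk):
            is_zero = chunk[i] == 0
            j = i
            while j < len(chunk) and (chunk[j] == 0) == is_zero:
                j += 1
            if is_zero:
                dst.write(struct.pack("<H", ZERO_FLAG | (j - i)))
            else:
                dst.write(struct.pack("<H", j - i))
                dst.write(chunk[i:j])
            i = j
    dst.write(struct.pack("<H", 0))   # terminating zero word
    return total


# Input larger than 64 KiB, processed with a 16 KiB buffer.
source = io.BytesIO(bytes(100000) + b"abc")
target = io.BytesIO()
print(pack_stream(source, target), len(target.getvalue()))
```

Because no record ever spans more than one chunk, the same structure would carry over to a 16-bit build where the buffer has to stay below 64 KiB.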