ghaerr / elks

Embeddable Linux Kernel Subset - Linux for 8086

NEC PC-98 enhancements to ELKS #1238

Closed ghaerr closed 2 years ago

ghaerr commented 2 years ago

Continued conversation from #1047.

tyama501 commented 2 years ago

Thank you @ghaerr

Oh I did it again, intC5 not int1C...

I will modify it and make PR. (maybe tomorrow or day after)

I might add MODE 0 to go back but I think MODE 1 that I added is almost default mode (640x400, color) so it is ok to exit with it.

tyama501 commented 2 years ago

Hello @ghaerr ,

I added host_gcls() to clear the graphics plane with the CLS command. I also added test98.bas, which uses MODE/CLS/PLOT/READ/DATA.

I will make PR for these.

basic_test98

Thank you.

tyama501 commented 2 years ago

Hello @ghaerr , I took a video with PC-9801RX21 ( intel 286, 12MHz ) so I attached it.

https://user-images.githubusercontent.com/61556504/165905837-d1f17289-1412-46f1-afec-5110913c4516.mp4

I also used the "screen" command to run a BASIC infinite loop on the other screen, then looked at the CPU usage in ps. The CPU usage display is a very nice new feature! It says 95% for basic. The total does not seem to be 100%, but is that because of the calculation method?

ELKS_cpu_usage_20220429_164402

ghaerr commented 2 years ago

Hello @tyama501,

Thanks for the video, it's interesting to see the speed on a 12MHz system. There's a lot going on with BASIC's floating point as well as the BIOS graphics draw, so I'm not sure (yet) what to do to speed things up.

The CPU Usage is very nice new feature! It says 95% for the basic. The all the total seems not 100% but that is because of the calculating method?

Yes, the CPU usage calculation is done every two seconds, and uses an exponential decay (=1/e currently), which essentially keeps 37% of the old value, and 63% of the new value. Also, when starting a new program such as ps that may start or terminate outside of a full 2-second window, the results may be inaccurate beyond the decay calculation.

I haven't been able to test the calculation routine outside of my own QEMU testing, and am a bit surprised at the CPU values on your test being quite a bit above 100%. I suspect this is because ps ran quite quickly. You might try running ps multiple times to see how that affects the displayed results.

If you'd like to play with the calculation like I did originally, some much larger decay values can be used from the following table in include/linuxmt/fixedpt.h (or calculate your own decay):

/* exponential factor is amount kept between cycles, add new N at (1 - factor) */
//#define EXP_E     (0.3679 * (1 << FSHIFT))    /* 1/e^(2sec/2sec) = 36.79% factor */
#define EXP_E       753         /* 36.79% factor 1/e^(2sec/2sec) as fixed-point */
#define EXP_125     256         /* 12.50% factor (256/2048) */
#define EXP_008     16          /*  0.80% factor (16/2048) */

A value other than EXP_E could then be used in the following line in elks/arch/i86/kernel/timer.c:

 CALC_USAGE(p->average, EXP_E, (unsigned long)p->ticks << FSHIFT);

Using EXP_008 should allow for a much more immediate view of CPU usage, with less history. Setting the correct factor depends on how much instantaneous vs historical CPU usage is desired to be shown. I used the 1 / e value because that is considered correct for the manner in which CPU load average is calculated, but that's not quite the same as CPU usage.
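As a rough illustration of the decay described above, here is a minimal sketch of a fixed-point exponential average. It assumes FSHIFT is 11 (inferred from the "/2048" in the table comments) and uses a hypothetical function `decay_average`; the real CALC_USAGE macro in the kernel may be written differently.

```c
#include <assert.h>

#define FSHIFT   11                 /* assumed: 2048 == 1.0 in fixed point */
#define FIXED_1  (1UL << FSHIFT)
#define EXP_E    753UL              /* factors quoted in the table above */
#define EXP_125  256UL
#define EXP_008  16UL

/* Hypothetical helper, not the kernel macro: keep factor/2048 of the old
 * average and add (1 - factor/2048) of the new sample, all in fixed point. */
unsigned long decay_average(unsigned long old_avg, unsigned long factor,
                            unsigned long sample)
{
    assert(factor <= FIXED_1);
    return (old_avg * factor + sample * (FIXED_1 - factor)) >> FSHIFT;
}
```

With EXP_E, a process that used 100 ticks in its first window would show roughly 63 after one update, while EXP_008 would show nearly the full 100 right away.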

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

Thank you for the explanation of CPU usage. That is fine for me.

I added COLOR and DRAW command as follows. https://github.com/tyama501/elks/commit/13aa164daafbd46c34900f7a28243070a9c6b21d

basic_draw

I also added the CIRCLE command, but the system seems to hang during the BIOS call. The C5 interrupt handler problem might be the cause. Maybe I will remove the CIRCLE command after some trials.

I have some questions about the commands. In the README there is a third parameter for the DRAW command: DRAW x,y[,a]. What is [,a] for?

For the CIRCLE command the third parameter is also optional: CIRCLE x,y[,r]. If we drop the third parameter, is the previous value used for the radius?

Thank you!

tyama501 commented 2 years ago

(Oops I forgot to reject the third parameter in host-pc98 when negative...)

ghaerr commented 2 years ago

Hello @tyama501,

I also added CIRCLE command but it seems system hangs during bios call.

That is strange, does it always hang on CIRCLE, or sometimes work? Are there multiple sources of documentation of exactly how LIO works for circle drawing?

In the README there is third parameter for DRAW command. DRAW x,y[,a] What is [,a] for?

I grabbed this from the sinclair basic online manual. It says "An extra frill with DRAW is that you can use it to draw parts of circles instead of straight lines, by using an extra number to specify an angle to be turned through...". It seems with 3rd parameter, it is an arc-drawing routine. Perhaps we should not implement 3rd parameter at all, and only have DRAW draw lines.

For CIRCLE command the third parameter is also optional. CIRCLE x,y[,r] If we drop third parameter, the previous value is used for radius?

The same documentation above shows the radius parameter used at all times. Perhaps to keep things simple, we should not implement optional parameters. This will simplify both the interpreter as well as the host-pc98.c processing. If we take that option, CIRCLE would take 3 parameters and DRAW 2 parms.

I added COLOR and DRAW command as follows.

The code looks good, but we are writing lots of code to parse optional parameters, and having multiple procedures that do nearly the same thing. Perhaps consider, instead of adding separate parse1or2IntCmd and parse2or3IntCmd functions, just enhancing the existing parse2IntCmd to take a parameter of the number of integers expected and parse those parameters, and rename it "parseNIntCmd(int n)". Then pass it the expected number of parameters for each of the POSITION, PIN, PINMODE, PLOT, DRAW and COLOR commands. The interpreter will remain smaller and simpler, as will the parameter processing in host-pc98.c.
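The fixed-count idea can be sketched in isolation. This is only an illustration on a plain string: `parse_n_ints` and `MAX_INT_ARGS` are hypothetical names, and the real interpreter works on its own token stream rather than on text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define MAX_INT_ARGS 4   /* hypothetical upper bound for this sketch */

/* Parse exactly n comma-separated integers from text into out[].
 * Returns 0 on success, -1 if the list is malformed or has the wrong length. */
int parse_n_ints(const char *text, int n, long out[])
{
    assert(n >= 1 && n <= MAX_INT_ARGS);
    for (int i = 0; i < n; i++) {
        char *end;
        out[i] = strtol(text, &end, 10);
        if (end == text)
            return -1;              /* no number where one was expected */
        text = end;
        while (isspace((unsigned char)*text))
            text++;
        if (i < n - 1) {
            if (*text != ',')
                return -1;          /* too few arguments */
            text++;
        }
    }
    return *text == '\0' ? 0 : -1;  /* reject anything after the n-th value */
}
```

Each command then only states its count, for example 3 for CIRCLE and 2 for DRAW, and no per-command optional-parameter logic is needed.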

(Oops I forgot to reject the third parameter in host-pc98 when negative...)

Yes, that won't be needed if we take the fixed-parameter approach!

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

Thank you for adding CONFIG_TIME_RTC_LOCALTIME. It is working.

Back to the Basic, I modified parseTwoIntCmd to parseNIntCmd as follows. https://github.com/tyama501/elks/commit/a81142c0588ed94e4e53392414d92b5b93b57224

For the COLOR command, I don't use the 2nd parameter "bg", but I treated it as a 2-parameter command since IBM may use it.

For the CIRCLE command, it still always hangs, so I wanted to add intC5 asm code to try, but somehow I cannot link; I get an error like this:

    ia16-elf-gcc -fno-inline -melks-libc -mcmodel=small -mno-segment-relocation-stuff -mtune=i8086 -Wall -Os -o basic basic.o host.o host-pc98.o intC5-pc98.o
    host-pc98.o:function host_mode: error: undefined reference to 'intc5_handler!'
    collect2: error: ld returned 1 exit status
    gmake: *** [Makefile:36: basic] Error 1

The Makefile is like this. Is there any other part I need to modify?

    ifeq ($(CONFIG_ARCH_PC98), y)
    OBJS += host-pc98.o intC5-pc98.o
    endif

The code is like this.

intC5-pc98.S

        .arch i8086, nojumps
        .code16
        .text

        .global intc5_handler
    intc5_handler:
        iret

host-pc98.c

    extern void intc5_handler(void);
    ...
    // Set interrupt handler for 0xC5
    intvec = _MK_FP(0, 0xC5 << 2);
    *intvec = (unsigned long) ((unsigned long __far *) intc5_handler);

ghaerr commented 2 years ago

Thank you for adding CONFIG_TIME_RTC_LOCALTIME. It is working.

Thank you for your testing! I'm glad we're done with time functions :)

I modified parseTwoIntCmd to parseNIntCmd as follows.

Looks good!

intC5-pc98.S

        .arch i8086, nojumps
        .code16
        .text
        .global intc5_handler
    intc5_handler:
        iret

That is very strange, I am not sure what the problem is.

Makefile is like this. Is there any other part I need to modify?

That seems correct, your ld output seems to show all proper .o files are being included.

Perhaps use ia16-elf-objdump -t to display the symbol table on intC5-pc98.o as well as host-pc98.o, to see that intc5_handler symbol(s) match up.

It is possible that following statement could generate code thinking that intc5_handler is in data segment: (?)

 *intvec = (unsigned long) ((unsigned long __far *) intc5_handler);

Use ia16-elf-objdump -D -r -Mi8086 to disassemble host-pc98.o to see precise reference to intc5_handler. Go ahead and post disasm output and I will try to help.
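For reference, the address arithmetic behind `_MK_FP(0, 0xC5 << 2)` can be checked on any host. The sketch below is plain arithmetic with hypothetical helper names (`ivt_offset`, `ivt_entry`); it does not touch real memory and is not the ia16 far-pointer code itself. It assumes the usual real-mode layout of 4 bytes per vector, offset word first and segment word second.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of vector n in the interrupt vector table at 0000:0000,
 * which holds one 4-byte entry per vector. */
uint16_t ivt_offset(unsigned n)
{
    assert(n <= 0xFF);
    return (uint16_t)(n << 2);
}

/* Value stored in a vector when written as one 32-bit little-endian word:
 * handler offset in the low 16 bits, handler segment in the high 16 bits. */
uint32_t ivt_entry(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}
```

For INT C5h this gives offset 0x314, so the handler's segment:offset pair is written at 0000:0314.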

Thank you!

ghaerr commented 2 years ago

.text .global intc5_handler intc5_handler:

Can you post your intC5-pc98.S file? Please check to be very sure the spelling between the .global and definition of the intc5_handler symbol is the same. It seems that as will allow a .global of an undefined symbol, and no error will be shown, which would be the case if the names don't match.

tyama501 commented 2 years ago

Hello @ghaerr ,

ia16-elf-objdump -t intC5-pc98.o

    00000000 l    d  .text  00000000 .text
    00000000 l    d  .data  00000000 .data
    00000000 l    d  .bss   00000000 .bss
    00000000 g       .text  00000000 intc5_handler

ia16-elf-objdump -t host-pc98.o | less

    00000000         *UND*  00000000 intc5_handler!
    00000000         *UND*  00000000 intc5_handler&
    00000000         *UND*  00000000 intc5_handler

It seems UND.

Attached ia16-elf-objdump -D -r -Mi8086 host-pc98.o objdump_intc5.log

intc5-pc98.S

        .arch i8086, nojumps
        .code16
        .text

        .global intc5_handler
    intc5_handler:
        iret

Thank you.

tyama501 commented 2 years ago

Not about circle but one more thing,

this basic program works BAK98.txt

READ_ok

but this program does not work. test98.txt

READ

The only difference is the line numbers; I renumbered it. It is strange that host_color is called 7 times, not 3.

Is there some limitation on line numbers?

tyama501 commented 2 years ago

Sorry GOTO number should be different...

ghaerr commented 2 years ago

Hello @tyama501,

ia16-elf-objdump -t intC5-pc98.o

    00000000 l    d  .text  00000000 .text
    00000000 l    d  .data  00000000 .data
    00000000 l    d  .bss   00000000 .bss
    00000000 g       .text  00000000 intc5_handler

This shows a zero-length text segment. Perhaps run ia16-elf-size on it to check for non-zero text size.

intc5-pc98.S

        .arch i8086, nojumps
        .code16
        .text
        .global intc5_handler
    intc5_handler:
        iret

Are these statements indented with at least one tab or space character? Perhaps the macros are not recognized in the first column.

Sorry GOTO number should be different...

Ok does that fix your BASIC problem, or is it something else?

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

ia16-elf-size intC5-pc98.o

       text    data     bss     dec     hex filename
          1       0       0       1       1 intC5-pc98.o

ia16-elf-objdump -D -r -Mi8086 intC5-pc98.o

    intC5-pc98.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <intc5_handler>:
       0:   cf      iret

Are these statements indented with at least one tab or space character?

Yes. It seems .S file is not able to be attached here so I renamed and attached it. intC5-pc98.txt

Ok does that fix your BASIC problem, or is it something else?

Yes. It is fixed.

Thank you.

tyama501 commented 2 years ago

If there is easier (and better) way to handle it in the kernel maybe we can try that.

ghaerr commented 2 years ago

Hello @tyama501,

Add the following two lines to basic/Makefile:

.S.o:
    $(CC) $(CFLAGS) -c -o $*.o $<

The problem is incorrect flags being passed to ia16-elf-as. This should allow you to debug, I will fix the problem more fully in a PR.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

It worked!

circle_ok

ghaerr commented 2 years ago

It worked!

That's great - so the circle command requires an INT C5h handler?

tyama501 commented 2 years ago

Yes. Maybe the circle is complex function and takes more time than plot and line.

May I make PR after cleaning up?

ghaerr commented 2 years ago

May I make PR after cleaning up?

Yes, of course. Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

I will start to think about supporting IDE hard drives, but I have noticed an issue.

Currently there is no /dev/hdc or /dev/hdd for PC-98, so we cannot access a third or fourth drive. image

This is because inode.c has fd2/fd3 instead of hdc/hdd.
https://github.com/jbruchon/elks/blob/d70317c5ccd39a39535de148976a5136ed0ccf3d/elks/fs/msdos/inode.c

    #ifdef CONFIG_ARCH_IBMPC
        { "hdc",    S_IFBLK | 0644, MKDEV(3, 64) },
        { "hdd",    S_IFBLK | 0644, MKDEV(3, 96) },
    #endif
    ...
    #ifdef CONFIG_ARCH_PC98
        { "fd2",   S_IFBLK | 0644, MKDEV(3, 192)},
        { "fd3",   S_IFBLK | 0644, MKDEV(3, 224)},
    #endif

I assume DEVDIR_SIZE of devnods[DEVDIR_SIZE] has some limitation so this is written like this but is there a way to add hdc(hdc1-hdc4) and hdd(hdd1-hdd4) to PC-98?
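The minor numbers in the excerpt (64, 96, 192, 224) look like a drive index multiplied by 32. The sketch below encodes that guess; the layout of 32 minors per drive with hard drives at indexes 0 to 3 and floppies at 4 to 7 is an inference from those four values, and `drive_minor` is a hypothetical name, not a kernel function.

```c
#include <assert.h>

#define MINOR_SHIFT 5   /* assumed: 32 minor numbers reserved per drive */

/* Hypothetical helper: minor number for a drive index and partition,
 * where partition 0 would mean the whole drive. */
unsigned drive_minor(unsigned drive, unsigned partition)
{
    assert(drive < 8 && partition < (1u << MINOR_SHIFT));
    return (drive << MINOR_SHIFT) | partition;
}
```

Under this assumption hdc is drive 2 (minor 64) and hdc1 would be minor 65, so adding hdc1-hdc4 and hdd1-hdd4 is a matter of table space rather than of new numbering.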

tyama501 commented 2 years ago

I am thinking of supporting up to 4 drives in total for IDE + SCSI.

ghaerr commented 2 years ago

Hello @tyama501,

Now, there is no /dev/hdc and /dev/hdd for PC-98 so we cannot access for third or fourth drive.

Perhaps remove /dev/hdb[2-4], that will give you 3 more directory locations? I don't know how many people use FAT partitions 2-4 on the second drive... ?

This is because inode.c has fd2/fd3 instead of hdc/hdd.

You're welcome to change those too, if you don't think you will use them.

I assume DEVDIR_SIZE of devnods[DEVDIR_SIZE] has some limitation so this is written like this

Yes, DEVDIR_SIZE is a special construct from its original contribution to 386 Linux, and is limited to 31. It is set to 30 now in include/linuxmt/msdos_fs_sb.h. You can increase it by 1 to 31 to gain one more entry.

is there a way to add hdc(hdc1-hdc4) and hdd(hdd1-hdd4) to PC-98?

Not easily, unless you remove other entries (31 max with the change above). You can change whatever you would like in the entire table for CONFIG_ARCH_PC98. Perhaps if this is most important to you, you may not need /dev/rd0, eth, tcp etc.

Perhaps best for now would be to entirely duplicate struct msdos_devdir_entry devnods[DEVDIR_SIZE] ifdef'd for PC-98, and leave the other table for IBM PC. Then you can completely decide what you will support for PC-98 without bothering about IBM PC.

I am thinking of supporting up to 4 drives in total for IDE + SCSI.

That should work with the code you contributed for the PDA rewrite. I am not sure you need all four partitions for four drives, but that will be your decision, and 31 entries max.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

I understand. Since I am using fd3/fd4, I will think about removing others.

Using SCSI emulator device with SD Card, it is very easy to increase drives and partitions. We are living in 2022 :)

ghaerr commented 2 years ago

Hello @tyama501,

Using SCSI emulator device with SD Card, it is very easy to increase drives and partitions. We are living in 2022 :)

I have researched the way MSDOS /dev emulation works, and finally understand we can increase the size beyond 31, after all. I will post a PR that allows us to keep the same /dev directory for FAT filesystems without having to ifdef IBM PC vs PC-98. I will also include all /dev/hd[cd][1234] so you should not have a problem.

Thank you!

tyama501 commented 2 years ago

Wow thank you @ghaerr !

tyama501 commented 2 years ago

Hello @ghaerr ,

With this modification, I could mount IDE image. I also added SCSI device type to avoid detecting SCSI cdrom as harddisk. https://github.com/tyama501/elks/commit/935b2534155ef22174f0ffb2d61e4f769e1cbb98

I got a real IDE hard drive for PC-9801BX so I will test with it.

ide1

ide2

Thank you!

ghaerr commented 2 years ago

Hello @tyama501,

Very nice, so now PC-98 supports both IDE and SCSI drives :)

After testing on real hardware, definitely post PR.

I am wondering, on lines 183 and 1136, whether all hard drives just have bit 7 set? That is, would the following be a possibly better way:

if (drive & 0x80) {
...

That is, BIOS IDE and SCSI always have 0x80 set, whereas floppy drives (1440k and 1232k) never? It seems that PDA numbering scheme you posted earlier worked that way.

I also added SCSI device type to avoid detecting SCSI cdrom as harddisk.

What is SCSI cdrom PDA value? If it has 0x80 bit set, then my above idea is not good.

Also, smaller point, but perhaps in all cases of "scsi_id + 0xA0" or "drive + 0x80" just use OR to get result: "scsi_id | 0xA0" or "drive | 0x80".

In both cases, these are just ideas for possible optimization, you are welcome to leave as-is.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

I am wondering whether on lines 183 and 1136, whether all hard drives just have bit 7 set? That is, BIOS IDE and SCSI always have 0x80 set, whereas floppy drives (1440k and 1232k) never?

The PDA of 1232k is 0x90 so we cannot distinguish just with bit7.

What is SCSI cdrom PDA value? If it has 0x80 bit set, then my above idea is not good.

All SCSI device PDA including cdrom is 0xA0. I was told on twitter that we can distinguish devices with device types obtained by int1B, AH = 0x14 so I added this command prior to int1B, AH = 0x84.
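To summarize the PDA values mentioned so far in this thread (0x80-0x83 for IDE hard drives, 0x90 for 1232k floppies, 0xA0 plus the ID for SCSI), here is a small sketch that classifies by the upper nibble. The names are hypothetical, the SCSI ID range 0-7 is an assumption, and as noted above the 0xA0 family alone cannot tell a SCSI hard disk from a CD-ROM.

```c
#include <assert.h>

enum pda_class { PDA_IDE_HD, PDA_FD_1232K, PDA_SCSI, PDA_OTHER };

/* Classify a PDA byte by its upper nibble, using only the values
 * quoted in this thread; anything else is reported as PDA_OTHER. */
enum pda_class pda_classify(unsigned char pda)
{
    switch (pda & 0xF0) {
    case 0x80: return PDA_IDE_HD;
    case 0x90: return PDA_FD_1232K;
    case 0xA0: return PDA_SCSI;     /* disk or CD-ROM: device type needs a BIOS query */
    default:   return PDA_OTHER;
    }
}

/* Build a SCSI PDA; OR and + give the same result because the ID
 * only occupies the low bits (assumed range 0-7). */
unsigned char scsi_pda(unsigned scsi_id)
{
    assert(scsi_id <= 7);
    return (unsigned char)(0xA0 | scsi_id);
}
```

This also shows why a plain `drive & 0x80` test is not enough: 0x90 has bit 7 set but is a floppy.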

Also, smaller point, but perhaps in all cases of "scsi_id + 0xA0" or "drive + 0x80" just use OR to get result: "scsi_id | 0xA0" or "drive | 0x80".

Good, I will modify this.

Thank you!

tyama501 commented 2 years ago

With gcc optimization, somehow it seems "scsi_id + 0xA0" or "drive + 0x80" get smaller result than "scsi_id | 0xA0" or "drive | 0x80". I leave it as-is.

tyama501 commented 2 years ago

Hello @ghaerr ,

Something is wrong with the PC-9801BX IDE. It seems 0x81 - 0x83 is responding although they are not available. It might be related to known PC-9801BX 500MB boundary issue but I need to debug.

PC-9801BX_IMG_20220521_174747

I clipped large disk to 544MBytes = (500MBytes + 44MBytes) so hda/hdb might be caused by this but also hdc/hdd is doubled.

tyama501 commented 2 years ago

Hello @ghaerr ,

It is not perfect, but I made some progress.

I could read/write/boot ide drive with the following modification. https://github.com/tyama501/elks/commit/7f8fc1f74dc8c2cef96793e4370aecdd9168db1c

PC-9801BX_IMG_20220525_022141

I got information that drive connections can be obtained from system memory. I used 0000:055D to detect the IDE drives. (Maybe 0000:055C can be used to detect the number of floppies.)

0000:055C for FDD
    bit3 FD3 (0x93)
    bit2 FD2 (0x92)
    bit1 FD1 (0x91)
    bit0 FD0 (0x90)

0000:055D for HDD
    bit3 HD3 (0x83)
    bit2 HD2 (0x82)
    bit1 HD1 (0x81)
    bit0 HD0 (0x80)
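Decoding such a bitmap could look like the sketch below. It takes the byte value as a parameter instead of reading 0000:055D itself, looks only at bits 0-3 as listed above, and uses a hypothetical function name.

```c
#include <assert.h>

/* Given a connection bitmap (bit n set means unit n is present) and the
 * PDA of unit 0 (0x80 for HDD, 0x90 for FDD), store the PDA of each
 * connected unit in pda_out[] and return how many were found. */
int connected_units(unsigned char bitmap, unsigned char base_pda,
                    unsigned char pda_out[4])
{
    int count = 0;

    assert((base_pda & 0x0F) == 0);
    for (int unit = 0; unit < 4; unit++)
        if (bitmap & (1u << unit))
            pda_out[count++] = (unsigned char)(base_pda + unit);
    return count;
}
```

For example, a value of 0x01 at 0000:055D would report only 0x80, so 0x81-0x83 would not be probed at all.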

The part that is not perfect is that I couldn't read/write the drive without using the pc98 boot selector. If I boot from floppy directly, I get the following error. I suspect some initialization process is needed for the disk. I will test a little more.

PC-9801BX_IMG_20220525_021350

tyama501 commented 2 years ago

Oh, and not related to ide but I noticed one more issue.

Unsupported option for copy like -a always hangs up system. cp -a (I always do this in Linux...)

Does this happen in ELKS for IBM too?

ghaerr commented 2 years ago

Hello @tyama501,

I got an information that drive connection can be obtained from system memory. I used 0000:055D to detect the ide drive.

Good information, it seems to be working for HDD but not FDD?

The thing not perfect is that I couldn't read/wite the drive without using pc98 boot selector. If I boot from floppy directly it get error as follows.

What do you mean "boot selector"? I don't quite understand the difference... are you talking about the byte PDA value being different passed to the BIOS when using boot value versus value at 0:055C?

Unsupported option for copy like -a always hangs up system. cp -a

Another bug! Looking at the cp.c source code shows that cp will hang when any invalid option is specified. I will fix.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

I will consider using 055C for FD in the future.

By boot selector I mean the IPL in the first cylinder of the hard disk, like an MBR. I am now asking the maintainer of the boot selector if he put some initialization code for the hard disk in it.

tyama501 commented 2 years ago

Hello @ghaerr ,

I am now asking the maintainer of the boot selector if he put some initialization code for hd.

I got an answer from him. We need to put MODE SET command int1B, AH=8E prior to the first read of the disk. I added it for the case the boot selector is not used. https://github.com/tyama501/elks/commit/dc5604268f36810463ec00e26a6525521cd7ef1a

I found another problem: something is wrong when I mount some of my images on the emulator. There are directories that cannot be read correctly.

disk_250MB

They can be read correctly from DOS.

Is there a maximum directory count? The root directory is also strange in that not all file names are shown with "ls -al". There should be 40 directories and 12 files.

disk_250MB_2

tyama501 commented 2 years ago

Here are the parameters when mounting.

image

ghaerr commented 2 years ago

Hello @tyama501,

I got an answer from him. We need to put MODE SET command int1B, AH=8E prior to the first read of the disk.

Good news. Does that then fix all of the problems (except the root directory issue) you were having?

I found an another problem that something is wrong when I mounted some of my image on the emulator. There are directories that cannot read correctly.

This is a very serious problem - and will need to be identified and fixed ASAP. I have a few questions:

From your first posted screenshot, it appears as though perhaps ELKS is interpreting a multibyte file size field incorrectly. I will need a disk image to most easily solve this.

We will have to track down exact specifics of HD disk image to see what exactly is on disk and why ELKS filesystem driver is interpreting incorrectly. I can't remember the exact issues of why this happened last time, but I seem to remember it was related to having the root directory start location computed incorrectly.

Were there maximum directory counts?

No, except for the root directory, which has a FAT-specified max size.

Here is the parameters when mounting.

It shows # fats=2 fat table size=250 and root dir location=502 seems correct. However, with 3072 directory entries at 32 byte each, starting at 502, (total 96 sectors for directory entries), it would seem that data location (=dloc) should start at 598, not 694. Is this a 1024 byte or 512 byte sector HD? It is possible there is some mixup occurring with HD being 512 byte sectors but using 1024 byte sectors.... ? I will have to look further into this.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

This is the image for IDE so it cannot be read without the driver modification.

I will think about the answers to your questions tomorrow.

Also, the image is 250 MBytes so I need to think about how to give it to you. I may also need to convert it to a file that dos-boxX can read.

ghaerr commented 2 years ago

This is the image for IDE so it cannot be read without the driver modification.

Well, we can read IDE drive using QEMU and IBM PC version of ELKS, so no problem there. It would be nice to have HD drive image that has never been written to by ELKS.

Also the image is 250Mbytes so I need to think how to give you.

We could use free account with dropbox.com and share it, not sure what maximum GitHub attachment is. Perhaps split file into smaller chunks and upload to GitHub, I can then cat them back together.

And maybe need to convert to the file that dos-boxX can read.

Probably don't need to change format of file, although I have dosboxX also. I will be debugging using QEMU and HD IDE image using IBM PC, most likely.

tyama501 commented 2 years ago

Oh, I forgot there might be Japanese "kanji" directory. That can cause problem. I will check that too.

tyama501 commented 2 years ago

It is PC-98 format partition.

ghaerr commented 2 years ago

there might be Japanese "kanji" directory. That can cause problem. I will check that too.

There is nothing I have read in FAT documentation that should cause problem for "kanji", but we may have to add other code to handle it.

It is PC-98 format partition.

That just means that the boot partition is PC-98, but the partition itself is FAT, correct? What might be needed would be to somehow extract partition from overall disk image. This would be done by dd skip= command to skip partition sector and any partitions prior to this one. Then mount using raw device rather than partition device (on PC-98 or IBM PC).

ghaerr commented 2 years ago

Hello @tyama501,

it would seem that data location (=dloc) should start at 598, not 694.

If this IDE disk is 512 bytes/sector, then it seems data start location is incorrect. Should be directly after root directory, which starts at 502. Since there are 3072 root directory entries @ 32 bytes/each (=96), data start (dloc=) should be 598. Instead it is showing 694.

This would be calculated to 694 if bytes/sector were incorrectly set at 1024, which is proper for FD1232 drive, but not HD. You should check your bioshd.c code to ensure that drivep->sector_size is set to 512.

Also, following code is used to calculate start data location in elks/fs/msdos/inode.c:

    sb->data_start = sb->dir_start +
        (sb-> dir_entries >> (SECTOR_BITS_SB(s) - MSDOS_DIR_BITS));

This should be calculated as: dir_entries (3072 >> (9 - 4) = (3072 >> 5) = 96. You might want to put in printk to confirm. The SECTOR_BITS_SB(s) is set in the code above it:

    switch (sb->sector_size) {
    case 512:
        sb->sector_bits = 9;    /* log2(sector_size) */
        sb->msdos_dps = 16;     /* SECTOR_SIZE / sizeof(struct msdos_dir_entry) */
        sb->msdos_dps_bits = 4; /* log2(msdos_dps) */
        break;
    case 1024:
        sb->sector_bits = 10;   /* log2(sector_size) */
        sb->msdos_dps = 32;     /* SECTOR_SIZE / sizeof(struct msdos_dir_entry) */
        sb->msdos_dps_bits = 5; /* log2(msdos_dps) */
        break;

'dps' means directory entries per sector.
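As a quick cross-check of the numbers, here is a sketch that uses the entries-per-sector values from the switch statement above (16 for 512-byte sectors, 32 for 1024-byte sectors). The function name is hypothetical and this is host-side arithmetic only, not the kernel code.

```c
#include <assert.h>

#define DIR_ENTRY_SIZE 32u   /* bytes per FAT directory entry */

/* First data sector: root directory start plus the number of sectors
 * needed to hold dir_entries entries at the given sector size. */
unsigned long fat_data_start(unsigned long dir_start,
                             unsigned long dir_entries,
                             unsigned sector_size)
{
    assert(sector_size == 512 || sector_size == 1024);
    unsigned entries_per_sector = sector_size / DIR_ENTRY_SIZE; /* 16 or 32 */
    return dir_start + dir_entries / entries_per_sector;
}
```

With dir_start = 502 and 3072 entries, this sketch gives 694 for 512-byte sectors (192 directory sectors) and 598 for 1024-byte sectors (96 directory sectors). If the entries-per-sector values are right, the reported dloc of 694 may already be consistent with a 512-byte sector size, so the shift amount used in the estimate above is worth double-checking with a printk.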

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

I haven't had time to check the sector size yet, but I think I could extract a raw FAT image. Can you get this? https://drive.google.com/file/d/1BO7CUlAuIhqNnbYQAylh36bimUFmpn5I/view?usp=sharing

I deleted the Japanese kanji directories etc., but the DEVEL/JWASM directory is still not shown properly from ELKS.

Thank you.

ghaerr commented 2 years ago

Hello @tyama501,

I could extract raw FAT image. Can you get this?

Yes, I have pulled it down and will look at it over the weekend :)

Can you tell me what format program was used to create this partition? (And options used, if known).

I deleted Japanese kanji directory etc. but DEVEL/JWASM directory still not shown properly from ELKS.

I am not surprised, as I think the kanji use only affects the 8.3 name portion of the directory entry, not the structure of the FAT filesystem.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

The one thing that concerns me about kanji is that Shift-JIS has double-byte characters that can be mistaken for escape characters, messing up the console.

ghaerr commented 2 years ago

Hello @tyama501,

The one thing I concerned about the kanji

Yes, there are issues with kanji, when kanji is used in the directory structure as well.

However, the problem is not a sector size issue. The problem is incompatible FAT directory format:

I have debugged first problem, from hex dump of root directory: the FAT directory format on your HD image is non-standard (according to MSFT long filename documentation) and that is what is causing the problem. You will need to read up on FAT directory entry format in order to understand the problem. It looks like, for some reason, perhaps PC-98 MSDOS or perhaps PC-98 FAT format is slightly different than standard. We will need documentation in order to implement the proper solution.

I have attached the Microsoft FAT32 spec (see Section 7 at the end of the document for what long filename entries are supposed to look like). Meanwhile, this is what the entries look like on the disk image (use hd to dump):

03ec00: 4b 45 52 4e 45 4c 20 20  53 59 53 20 00 00 d7 12  KERNEL  SYS ....
03ec10: 9d 52 00 00 00 00 a3 bc  5a 52 02 00 a2 c2 00 00  .R......ZR......
03ec20: 43 4f 4d 4d 41 4e 44 20  43 4f 4d 20 00 00 d7 12  COMMAND COM ....
03ec30: 9d 52 00 00 00 00 8c 5c  63 52 0f 00 4c 0e 01 00  .R.....\cR..L...
03ec40: 41 55 54 4f 45 58 45 43  42 41 54 20 00 00 95 b4  AUTOEXECBAT ....
03ec50: 63 51 00 00 00 00 95 b4  63 51 26 93 9c 03 00 00  cQ......cQ&.....
03ec60: 43 4f 55 4e 54 52 59 20  53 59 53 20 00 00 08 73  COUNTRY SYS ...s
03ec70: 5b 50 00 00 00 00 39 b4  93 4f 26 00 2a 76 00 00  [P....9..O&.*v..
03ec80: 46 44 43 4f 4e 46 49 47  53 59 53 20 00 00 65 97  FDCONFIGSYS ..e.
03ec90: e4 52 00 00 00 00 65 97  e4 52 2e 00 cd 02 00 00  .R....e..R......
03eca0: e5 44 4f 53 20 20 20 20  20 20 20 10 00 00 08 73  .DOS       ....s
03ecb0: 5b 50 00 00 00 00 08 73  5b 50 2f 00 00 00 00 00  [P.....s[P/.....
03ecc0: 4b 57 43 31 38 36 33 32  53 59 53 20 00 00 e7 12  KWC18632SYS ....
03ecd0: 9d 52 00 00 00 00 95 bc  5a 52 d7 03 bb cc 00 00  .R......ZR......
03ece0: 4b 57 43 38 36 31 36 20  53 59 53 20 00 00 e8 12  KWC8616 SYS ....
03ecf0: 9d 52 00 00 00 00 a3 bc  5a 52 e4 03 a2 c2 00 00  .R......ZR......
03ed00: e5 50 54 20 20 20 20 20  20 20 20 10 00 00 11 73  .PT        ....s
03ed10: 5b 50 00 00 00 00 11 73  5b 50 f1 03 00 00 00 00  [P.....s[P......
03ed20: 52 45 41 44 4d 45 20 20  42 41 54 20 00 00 12 73  README  BAT ...s
03ed30: 5b 50 00 00 00 00 ca b3  4e 4c 5f 04 21 00 00 00  [P......NL_.!...
03ed40: 52 45 41 44 4d 45 4a 41  48 54 4d 20 00 00 12 73  READMEJAHTM ...s
03ed50: 5b 50 00 00 00 00 36 8b  e7 4e 60 04 73 19 00 00  [P....6..N`.s...
03ed60: e5 64 00 65 00 70 00 74  00 68 00 0f 00 93 31 00  .d.e.p.t.h....1.
03ed70: 30 00 30 00 00 00 ff ff  ff ff 00 00 ff ff ff ff  0.0.............
03ed80: e5 45 50 54 48 31 30 30  20 20 20 10 00 c3 43 0a  .EPTH100   ...C.
03ed90: 5c 50 5c 50 00 00 43 0a  5c 50 62 04 00 00 00 00  \P\P..C.\Pb.....
03eda0: 44 45 56 45 4c 20 20 20  20 20 20 10 00 c5 43 0a  DEVEL      ...C.
03edb0: 5c 50 5c 50 00 00 43 0a  5c 50 99 04 00 00 00 00  \P\P..C.\P......
03edc0: e5 66 00 64 00 39 00 38  00 5f 00 0f 00 a5 33 00  .f.d.9.8._....3.
03edd0: 31 00 33 00 00 00 ff ff  ff ff 00 00 ff ff ff ff  1.3.............
03ede0: e5 44 39 38 5f 33 31 33  20 20 20 10 00 84 46 0a  .D98_313   ...F.
03edf0: 5c 50 5c 50 00 00 46 0a  5c 50 b9 92 00 00 00 00  \P\P..F.\P......
03ee00: e5 6a 00 65 00 64 00 31  00 39 00 0f 00 6f 34 00  .j.e.d.1.9...o4.
03ee10: 6e 00 00 00 ff ff ff ff  ff ff 00 00 ff ff ff ff  n...............
03ee20: e5 45 44 31 39 34 4e 20  20 20 20 10 00 86 46 0a  .ED194N    ...F.
03ee30: 5c 50 5c 50 00 00 46 0a  5c 50 d9 92 00 00 00 00  \P\P..F.\P......
03ee40: e5 72 00 65 00 61 00 64  00 31 00 0f 00 26 30 00  .r.e.a.d.1...&0.
03ee50: 32 00 00 00 ff ff ff ff  ff ff 00 00 ff ff ff ff  2...............
03ee60: e5 45 41 44 31 30 32 20  20 20 20 10 00 3a 0e 93  .EAD102    ..:..
03ee70: 75 50 75 50 00 00 0e 93  75 50 2e 93 00 00 00 00  uPuP....uP......
03ee80: e5 41 4b 20 20 20 20 20  20 20 20 10 00 00 98 0c  .AK        .....
03ee90: 88 50 00 00 00 00 98 0c  88 50 ec 92 00 00 00 00  .P.......P......
03eea0: e5 44 00 69 00 74 00 74  00 31 00 0f 00 57 35 00  .D.i.t.t.1...W5.
03eeb0: 30 00 00 00 ff ff ff ff  ff ff 00 00 ff ff ff ff  0...............
03eec0: e5 49 54 54 31 35 30 20  20 20 20 10 00 60 10 20  .ITT150    ..`. 
03eed0: db 50 db 50 00 00 10 20  db 50 3e 98 00 00 00 00  .P.P... .P>.....
03eee0: e5 6d 00 73 00 6b 00 33  00 31 00 0f 00 52 34 00  .m.s.k.3.1...R4.
03eef0: 00 00 ff ff ff ff ff ff  ff ff 00 00 ff ff ff ff  ................
03ef00: e5 53 4b 33 31 34 20 20  20 20 20 10 00 21 4f 17  .SK314     ..!O.
03ef10: fe 50 fe 50 00 00 4f 17  fe 50 4e 98 00 00 00 00  .P.P..O..PN.....
03ef20: e5 54 52 49 20 20 20 20  20 20 20 10 00 41 50 26  .TRI       ..AP&
03ef30: 02 51 02 51 00 00 50 26  02 51 04 9a 00 00 00 00  .Q.Q..P&.Q......
03ef40: e5 4d 44 34 38 4f 20 20  20 20 20 10 00 84 ae b1  .MD48O     .....
03ef50: 08 51 08 51 00 00 ae b1  08 51 08 9a 00 00 00 00  .Q.Q.....Q......
03ef60: e5 bf 8e 9a 20 20 20 20  20 20 20 10 00 00 9c 81  ....       .....
03ef70: 54 52 00 00 00 00 9c 81  54 52 c1 d8 00 00 00 00  TR......TR......
03ef80: e5 70 00 6d 00 64 00 70  00 76 00 0f 00 b2 39 00  .p.m.d.p.v....9.
03ef90: 32 00 67 00 00 00 ff ff  ff ff 00 00 ff ff ff ff  2.g.............
03efa0: e5 4d 44 50 56 39 32 47  20 20 20 10 00 b1 26 b6  .MDPV92G   ...&.
03efb0: 08 51 08 51 00 00 26 b6  08 51 31 9b 00 00 00 00  .Q.Q..&..Q1.....
03efc0: e5 61 00 73 00 31 00 34  00 31 00 0f 00 4f 38 00  .a.s.1.4.1...O8.
03efd0: 00 00 ff ff ff ff ff ff  ff ff 00 00 ff ff ff ff  ................
03efe0: e5 53 31 34 31 38 20 20  20 20 20 10 00 93 5b 13  .S1418     ...[.
03eff0: 0e 51 0e 51 00 00 5b 13  0e 51 43 9b 00 00 00 00  .Q.Q..[..QC.....
03f000: e5 4d 44 50 4a 20 20 20  20 20 20 10 00 00 97 9d  .MDPJ      .....
03f010: 10 51 00 00 00 00 97 9d  10 51 71 a0 00 00 00 00  .Q.......Qq.....
03f020: e5 4d 50 20 20 20 20 20  20 20 20 10 00 00 5c 28  .MP        ...\( <-- long filename MUSIC entry 1
03f030: 16 51 00 00 00 00 5c 28  16 51 8a a0 00 00 00 00  .Q....\(.Q......
03f040: e5 55 53 49 43 20 20 20  20 20 20 10 00 00 fd 15  .USIC      ..... <-- and entry 2
03f050: 1e 51 00 00 00 00 fd 15  1e 51 a4 a0 00 00 00 00  .Q.......Q......
03f060: e5 44 56 4f 4c 20 20 20  20 20 20 10 00 3b 53 11  .DVOL      ..;S.
03f070: 4b 51 4b 51 00 00 53 11  4b 51 b5 a0 00 00 00 00  KQKQ..S.KQ......
03f080: e5 49 4e 49 58 20 20 20  20 20 20 10 00 95 4b 89  .INIX      ...K.
03f090: 52 51 52 51 00 00 4b 89  52 51 bc a0 00 00 00 00  RQRQ..K.RQ......
03f0a0: 43 4c 49 4f 39 38 20 20  48 20 20 20 00 00 eb 91  CLIO98  H   ....
03f0b0: 26 52 00 00 00 00 50 90  26 52 d6 d7 44 02 00 00  &R....P.&R..D...
03f0c0: e5 73 00 63 00 73 00 69  00 32 00 0f 00 f3 30 00  .s.c.s.i.2....0.
03f0d0: 35 00 00 00 ff ff ff ff  ff ff 00 00 ff ff ff ff  5...............
03f0e0: e5 43 53 49 32 30 35 20  20 20 20 10 00 38 35 8e  .CSI205    ..85.
03f0f0: 2a 52 2a 52 00 00 35 8e  2a 52 d1 d7 00 00 00 00  *R*R..5.*R......

Note the entries for "MUSIC": both start with 0xE5, which is reserved for deleted entries. The second entry then uses the first character from the first entry, and also has 0xE5 as its "LDIR_Ord" value, which is incompatible with the FAT LFN format described in Section 7.
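For reference while reading the dump, here is a minimal sketch of how the FAT specification tells the entry types apart: byte 0 is 0x00 for end of directory or 0xE5 for a free (deleted) entry, and an attribute byte of 0x0F (after masking with 0x3F) marks a long filename slot. The enum and function names are made up for illustration and are not the ELKS code.

```c
#include <stdint.h>

/* Values from the Microsoft FAT specification. All identifiers below are
 * hypothetical and chosen for this sketch; they are not ELKS names. */
#define FAT_ENTRY_END      0x00  /* byte 0: this and all later entries are free */
#define FAT_ENTRY_DELETED  0xE5  /* byte 0: this entry is free (deleted) */
#define FAT_ATTR_LONG_NAME 0x0F  /* READ_ONLY | HIDDEN | SYSTEM | VOLUME_ID */
#define FAT_ATTR_LONG_MASK 0x3F  /* the four bits above plus DIRECTORY | ARCHIVE */

enum fat_entry_kind {
    FAT_KIND_END,            /* end of directory */
    FAT_KIND_DELETED_LONG,   /* deleted long filename slot */
    FAT_KIND_DELETED_SHORT,  /* deleted 8.3 entry */
    FAT_KIND_LONG,           /* live long filename slot */
    FAT_KIND_SHORT           /* live 8.3 entry */
};

/* Classify one raw 32-byte directory entry. */
enum fat_entry_kind fat_classify(const uint8_t entry[32])
{
    /* The attribute byte lives at offset 11 in both entry layouts. */
    int is_long = (entry[11] & FAT_ATTR_LONG_MASK) == FAT_ATTR_LONG_NAME;

    if (entry[0] == FAT_ENTRY_END)
        return FAT_KIND_END;
    if (entry[0] == FAT_ENTRY_DELETED)
        return is_long ? FAT_KIND_DELETED_LONG : FAT_KIND_DELETED_SHORT;
    return is_long ? FAT_KIND_LONG : FAT_KIND_SHORT;
}
```

Applied to the dump, the 32 bytes at 03f0c0 classify as a deleted long filename slot, the entry at 03f0e0 as a deleted short entry, and CLIO98 at 03f0a0 as a live short entry.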

If you study this hex dump carefully, you can see all of the missing directory entries. Not all of them use 0xE5; some are standard.
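To match a long filename slot in the dump with the short entry that follows it, the checksum stored at offset 13 of each slot can be compared against the 11-byte 8.3 name. Below is a sketch of that checksum as given in the FAT specification; the function name is hypothetical.

```c
#include <stdint.h>

/* Checksum over the 11-byte 8.3 name (8 name bytes + 3 extension bytes,
 * space padded), as described for long filename slots in the FAT
 * specification. Each step rotates the running sum right by one bit and
 * adds the next name byte, all in 8-bit arithmetic.
 * The function name is hypothetical, not an ELKS identifier. */
uint8_t fat_lfn_checksum(const uint8_t name83[11])
{
    uint8_t sum = 0;
    int i;

    for (i = 0; i < 11; i++)
        sum = (uint8_t)(((sum & 1) ? 0x80 : 0) + (sum >> 1) + name83[i]);
    return sum;
}
```

As an example, the slot at 03f0c0 carries the long name "scsi205" and stores 0xF3 at offset 13. The short entry after it at 03f0e0 reads 0xE5 followed by "CSI205". Assuming the overwritten first byte was 'S', the checksum of "SCSI205" padded to 11 bytes with spaces is 0xF3, which matches the stored value.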

If I change the ELKS long filename directory code to stop treating 0xE5 as a deleted entry, we get the following when using ls -l or ls on /mnt (your test image):

Screen Shot 2022-05-27 at 4 38 07 PM

Note that most directories, but not all, now show. Note also that 0xE5 displays as a block character, and that the first character is not replaced from the previous LFN (long filename) entry.

You can test this using the following diff:

diff --git a/elks/fs/msdos/dir.c b/elks/fs/msdos/dir.c
index b384b107..598d1ef9 100644
--- a/elks/fs/msdos/dir.c
+++ b/elks/fs/msdos/dir.c
@@ -100,11 +100,11 @@ int FATPROC msdos_get_entry_long(
        is_long = 0;
        *ino = msdos_get_entry(dir,pos,bh,&de);
        while (*ino != (ino_t)-1L) {
-               if (de->name[0] == 0)           /* empty  entry and stop reading*/
+               if (de->name[0] == 0) {         /* empty  entry and stop reading*/
                        break;
-               else if (((unsigned char *)(de->name))[0] == DELETED_FLAG) {    /* empty entry*/
-                       is_long = 0;
-                       oldpos = *pos;
+               //} else if (((unsigned char *)(de->name))[0] == DELETED_FLAG) {        /* empty entry*/
+                       //is_long = 0;
+                       //oldpos = *pos;
                } else if (de->attr ==  ATTR_EXT) {             /* long filename entry*/
                        int slot = 0;
                        register struct msdos_dir_slot *ds = (struct msdos_dir_slot *) de;

I cannot proceed with a fix until we find documentation on exactly how this non-standard directory format works. We also need to learn whether FreeDOS works with your image, or whether it requires PC-98 DOS. I am looking at the FreeDOS source now to try to learn something.

Also, I am still looking into your second problem, why the file sizes are incorrect. I am sure it also has to do with the incompatible directory entries, for which we will need documentation.
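For comparison when checking sizes by hand, here is a rough sketch of where the size and starting cluster sit in a standard 32-byte short entry. The struct and helper names are hypothetical, and this assumes the standard layout from the FAT specification rather than whatever this image actually uses.

```c
#include <stdint.h>

/* Selected fields of a standard FAT short directory entry.
 * Hypothetical names for this sketch; not ELKS structures. */
struct fat_short_info {
    uint8_t  attr;          /* offset 11 */
    uint32_t first_cluster; /* high word at offset 20, low word at offset 26 */
    uint32_t size;          /* offset 28, file size in bytes */
};

/* On-disk multi-byte fields are little-endian. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Pull the attribute, starting cluster and size out of a raw entry. */
struct fat_short_info fat_decode_short(const uint8_t entry[32])
{
    struct fat_short_info info;

    info.attr = entry[11];
    /* The high word is only meaningful on FAT32; it is zero on FAT12/16. */
    info.first_cluster = ((uint32_t)le16(entry + 20) << 16) | le16(entry + 26);
    info.size = le32(entry + 28);
    return info;
}
```

For the one live entry in the dump, CLIO98 at 03f0a0, this gives attribute 0x20, first cluster 0xD7D6 and a size of 0x244 (580) bytes.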

Microsoft FAT32 Spec (SDA Contribution).pdf.zip

Thank you!

tyama501 commented 2 years ago

Thank you for your observation @ghaerr ,

OK, I think the problem was caused by using some freeware for partitioning, and also by using some freeware many times to read/write files from Windows.

If it is incompatible with the FAT standard, we don't need to care about it.

I noticed before that FreeDOS can access these files, but when I used a tool to show the directories after reading/writing from Windows, the result was strange. So maybe the Windows tool is causing the problem.

I am still not sure whether the sector size is recognized as 1024 or not. I think that code is common to SCSI and IDE...