roelandjansen / pcmos386v501

PC-MOS/386 v5.01 and up, including cdrom driver sources.
GNU General Public License v3.0

kernel broken #42

Closed ghost closed 5 years ago

ghost commented 5 years ago

Finally got my workbench and tools set up. First thing I discover is, the kernel is already broken.

Here's the eval kernel from a SHIPMOS floppy install, boots up:

image

Here's the eval kernel I built from the original files in ref / kernel.zip, boots up, notice the different serial number:

image

Here's the eval kernel I built from the git src / kernel, which includes patches as of yesterday. It hangs, never reaches the command prompt:

image

Changing MAKEMOS.BAT to build the R&D instead of eval was a bad idea. You lose your reference point, which is the shipped kernel from SHIPMOS. Never lose your reference point! Figuring out what the R&D kernel does, is a job for later, when all bugs are fixed in the eval kernel. Until then, leave well enough alone!

Here's the diff between the original files that work, vs the current git source that builds a broken kernel. Anyone recognize their work?

zdiff.txt

sub205 commented 5 years ago

I've seen one of the changes here:

https://github.com/roelandjansen/pcmos386v501/pull/35/commits/5f94ef2a41aa1a2dd3498c095764bbc0f07f9fe6

ghost commented 5 years ago

That fixes an obvious typo, so I doubt it breaks the kernel. But to be sure, it should be tested alone.

Only 5 kernel patches into the project, and it's already broken? People not even bothering to see if it boots?

Ugh.

andrewbird commented 5 years ago

I just built and booted both $$eval and $$mos from current git master.

$$eval.sys

pc-mos-eval

$$mos.sys

pc-mos-mos

ghost commented 5 years ago

Working in your environment doesn't make it work in mine. Your boot messages are different.

andrewbird commented 5 years ago

Perhaps you should identify the problem with your environment before accusing people of not testing?

ghost commented 5 years ago

You got it backwards, man. My environment is not the problem.

I used only the files from the original kernel.zip as provided by TSL, built the kernel without adding any changes of my own, and the resulting kernel boots fine, using the environment provided in SHIPMOS.

Then I used the current 5 patches committed to the project, and it fails. My environment is not the problem. One or more of the 5 patches is the problem.

I will not spend my time cleaning up after lazy programmers who believe that if it works for them, it can't be their problem. If that's how this project is run, I can go my own way.

sub205 commented 5 years ago

Guys, calm down!

This project is new and every helping hand is good. Criticism, yes, but please constructive!

What environments are both of you using?

the-grue commented 5 years ago

What environments are both of you using?

Agreed, this information would be helpful for troubleshooting. It would also allow others with a similar environment to confirm they can see the bug as well.

In addition to environment, perhaps the contents of your autoexec.bat and config.sys as well to help rule out external factors.

sub205 commented 5 years ago

I'm currently installing a pcem14 VM with PCMos. The good thing about PCem is that it can emulate many different PCs, from XT up to Pentium. After that I also have access to a lot of older machines, especially pre-2000.

ghost commented 5 years ago

Mine is VMWARE 8 with 4.x hardware compatibility.

But that's probably beside the point. I used plain vanilla kernel.zip files as provided by TSL, and a simple batch file to make it. He's got extra stuff in his boot messages, and it's not my job to understand why his works. He can use the vanilla TSL build like me, and see what happens.

image

image

the-grue commented 5 years ago

You didn't install the booster disk did you?

I'll give this a try in virtualbox, as it is the only one I have access to at the moment.

Out of curiosity, what editor is that?

ghost commented 5 years ago

No booster. Just the system disk and auxiliary. Why would I want to install the booster?

ghost commented 5 years ago

The editor is

The Norton Commander Version 5.5, Copyright (C) 1986 - 1998 by Symantec Corporation.

the-grue commented 5 years ago

I was just asking. I installed the booster disk in one of my environments because it was there. Says it improves disk I/O among other things. But it looks like it may have been made by an outside organization and changes a lot of stuff if the backup directory is to be believed. Just wanted to rule that out for possible compatibility issues.

ghost commented 5 years ago

Wild guess, he has the booster installed and that makes the difference.

Whatever patches are made, should work in a minimal environment, without the booster. If people want to use it, they should make sure their patches work both ways -- vanilla too.

Is there source code for all the booster stuff? I've not looked.

the-grue commented 5 years ago

No, it is disk 3 in the SHIPMOS directory.

I would think with the booster installed, a newly built kernel might have problems, not the other way around.

ghost commented 5 years ago

I'm looking now. That's all there is? No source code? I'm sure not adding any binary blob to 5.01 source code.

the-grue commented 5 years ago

Agreed.

ghost commented 5 years ago

I'm guessing we know the problem now. If there is no further activity in a few days, I will close this.

the-grue commented 5 years ago

Confirmed. Virtualbox hang with both the $$MOS.SYS and $$EVAL.SYS kernels under the current source tree using @src153 configuration as well as default config.sys in both cases.

@andrewbird what do your autoexec.bat and config.sys look like? Also, what platform are you having success on?

ghost commented 5 years ago

His boot emits extra messages I don't have. What else could they be, but booster? If that's what they are, then start over and install without booster. Use only the two disks supplied by TSL, which we presumably have matching source code for. Trying to support a binary blob will impede further development of the source code.

The next step is to figure out which patch or patches break the vanilla environment, and either fix them, or get rid of them.
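
One way to approach that search, sketched in Python under loud assumptions: the patches can be applied in order on top of the vanilla kernel.zip sources, and there is some way to tell whether a given build boots. Both `first_bad_patch` and the `boots_with` callback are hypothetical names, not anything in the repo; in practice `boots_with` would be a person building the kernel and watching the VM.

```python
from typing import Callable, Optional, Sequence


def first_bad_patch(patches: Sequence[str],
                    boots_with: Callable[[int], bool]) -> Optional[str]:
    """Return the first patch whose inclusion breaks the boot, or None.

    boots_with(n) reports whether a kernel built from the vanilla sources
    plus the first n patches boots. Assumes boots_with(0) is True and that
    once a prefix of patches fails, every longer prefix fails as well.
    """
    if boots_with(len(patches)):
        return None  # everything applied and it still boots
    lo, hi = 0, len(patches)  # invariant: prefix lo boots, prefix hi does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_with(mid):
            lo = mid
        else:
            hi = mid
    return patches[hi - 1]
```

With only five patches a linear walk is just as practical; the halving only pays off as the patch list grows.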

the-grue commented 5 years ago

If @andrewbird is more familiar with the product, he may know how to configure additional settings that are available in the base OS like the serial port connections, multiuser, etc. I'm interested in his build environment to see if success can be replicated before calling for any patch rollbacks. We can see it is broken, I'd like to see how it is still working for him.

ghost commented 5 years ago

Sure, learn whatever you can from him. But any patch that breaks the vanilla setup must be fixed or reversed. It's not a user's job to fight programmer error.

stsp commented 5 years ago

@the-grue IIRC you need to re-build everything, including shell and the drivers you use, because the common headers were changed. Or you can try to comment out the line https://github.com/roelandjansen/pcmos386v501/blob/5fa7b36d9dbe10268665cdb3ba8f325505925147/SOURCES/src/kernel/MOSDDINT.ASM#L587 to nullify the effect of the only patch that changed anything at all.

ghost commented 5 years ago

If a header dependency requires a full rebuild, I won't stand in the way of progress. But the binary blob issue remains, and I would like to know what the header change fixed.

the-grue commented 5 years ago

@stsp you do have a point, but I do not believe anything in the latest micropatches changed files under mos5src, which is where the current makeutil and maketerm are built. This could be an oversight and is more reinforcement that we need to do a code audit before applying any more code changes or we will continually run up against these mismatches if someone forgets to update a file in one directory or the other.

If nobody else is interested, I will take this code audit on.

  1. Identify duplicate files
  2. Build all files from my pre-changed repo
  3. Checksum files in last known good distribution (disks 1 and 2 under SHIPMOS)
  4. Checksum files in latest builds to identify which files from step 2 match files from step 3
  5. Create an audit trail for review, including any files that are NOT built by the source but are on disks 1/2 in SHIPMOS.
  6. Submit a patch to add an audited build tree to the repo separate from current build paths
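
The checksum comparison in steps 1 and 3 to 5 could be sketched roughly like this in Python. This is only an illustration, not the audit tooling itself: the function names are made up, SHA-256 is an arbitrary choice of checksum, and files are matched by upper-cased name alone, on the assumption that DOS names are case-insensitive and that the same file may live in more than one directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def checksum_tree(root):
    """Return {FILENAME: {sha256 digest, ...}} for every file under root."""
    sums = defaultdict(set)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            sums[path.name.upper()].add(digest)
    return dict(sums)


def duplicates(tree):
    """File names that appear with more than one distinct content."""
    return sorted(name for name, digests in tree.items() if len(digests) > 1)


def audit(shipped, built):
    """Classify each shipped file name against the build output."""
    report = {"identical": [], "differs": [], "not_built": []}
    for name in sorted(shipped):
        if name not in built:
            report["not_built"].append(name)   # on the disks, not produced by the source
        elif shipped[name] & built[name]:
            report["identical"].append(name)   # at least one build copy matches byte for byte
        else:
            report["differs"].append(name)
    return report
```

A call such as `audit(checksum_tree(shipped_dir), checksum_tree(build_dir))`, with both directory arguments being whatever unzipped reference and build paths are in use, would give the three lists for the audit trail.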

ghost commented 5 years ago

I avoid heavy lifting as much as possible. I think we should pay you for that job. Set up a go fund me and I pledge $5 for job completion.

I am not joking, I am totally serious. Sure $5 ain't much, but maybe some others will bid higher. There's gotta be some rich folks around here somewhere. I think Roeland is hoarding the cash.

the-grue commented 5 years ago

Haha! Funny you should mention heavy lifting. Powerlifting is my other hobby.

If you offer a bounty on the work, I am sure other folks would be interested.

stsp commented 5 years ago

@the-grue In this particular case the only suspects for a rebuild are UPDAT501.SYS and shell, as this is all that you use. If you don't load UPDAT501.SYS and see whether the problem is still there, you'll narrow down the suspects even more.

ghost commented 5 years ago

I see some equipment in the background of your picture. It's not a bounty. I won't pay anyone else. You're the man.

the-grue commented 5 years ago

-- Off Topic - My Powerlifting Hobby -- https://www.youtube.com/jamestsprinkle -- End Off Topic --

the-grue commented 5 years ago

UPDAT501.SYS is an interesting creature.

The version in the jim directory is 007
The version in kernel and mos5src is 113
The version in disk 1 in SHIPMOS is 124
The version on the 9 user disk in IMAGES is 136

I also notice that UPDAT501.SYS is not in the build in any of the makefiles. I'll add it later.

This was a moving target apparently and we don't have the newest one in the code repo.

Not important as long as the one we have continues to work. Since it is a patch, it is possible it will conflict with any non-compatible rebuilds if it uses specific offsets.

Oh, and removing UPDAT501.SYS from config.sys lets the new compile boot. So I would bet my statement above is true. Good idea @stsp!

the-grue commented 5 years ago

I also went back and looked at @andrewbird's screenshots and they don't show UPDAT501.SYS being loaded.

ghost commented 5 years ago

https://www.youtube.com/jamestsprinkle

My gmail says the link is suspicious

Be careful with this message. It contains a suspicious link that was used to steal people's personal information

Not sure how a Youtube link can be harmful, but that's what gmail said.

stsp commented 5 years ago

Good idea @stsp!

Not me - @andrewbird told to try w/o UPDAT501.SYS. Since no one listened, I repeated that. His idea was that UPDAT501.SYS requires the kernel completely unmodified. This sounds more plausible to me than your idea about offsets, because the patch in question tries to not change the offsets (adds new variables into a padding space). So if offsets somehow changed, then it's a bit unexpected.

ghost commented 5 years ago

As for updat501 and its many versions, that's another mystery. Solving one mystery at a time is hard enough. A full code audit may be exhausting. I recommend not wearing yourself out on a job so large, at this early time. We can tackle these questions one at a time, as we encounter them. Fighting them all at once is probably too much.

One question, if you know the answer -- the kernel.zip file has original file dates inside. When I expand the zip on my DOS system, the original file dates are still there. That helps, because I know I'm looking at a file that is unchanged. Does github lose that info, or am I just not looking in the right place for it?

How about raw git itself? Surely it can keep original file dates, no? If it can, and that's just a flaw of github, maybe I could be persuaded to learn and use git.
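
For what it is worth, git records commit and author timestamps but not per-file modification times, so a fresh checkout stamps files with the checkout time; that is git's design rather than something GitHub drops. The dates inside the zip remain the record of the originals. A minimal Python sketch for listing them, assuming only the standard `zipfile` module (the helper name `zip_file_dates` is made up):

```python
import zipfile
from datetime import datetime


def zip_file_dates(zip_path):
    """Return {member name: datetime} using the timestamps stored in the archive.

    zip_path may be a path or an open binary file object. A member with an
    invalid stored date (for example month 0) would raise ValueError here.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return {
            info.filename: datetime(*info.date_time)
            for info in archive.infolist()
            if not info.is_dir()
        }
```

Dumping that mapping to a text file and committing it next to the sources would be one way to keep the original dates visible in the repo without relying on file metadata.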

I just got an email offer from servaRICA (I have a dormant account there):

VPS with 2TB disk,UNLIMITED traffic , 1GB ram and dedicated CPU for 10$/month lifetime

That's tempting. It would be even cheaper if a small team pooled resources. We could run our own git. I even have a domain name stashed away for a rainy day.

stsp commented 5 years ago

@the-grue How about unzipping SHIPMOS.ZIP and provide it as a directory? In that case the patch that removes updat501.sys from config.sys would be the instant solution. I do not suppose we need to update SHIPMOS.ZIP itself, it should always contain the original binaries. But there should also be an unzipped form of it, with current builds and updated config.sys.

ghost commented 5 years ago

Not me - @andrewbird told to try w/o UPDAT501.SYS.

Where? I searched this thread in my browser and did not see it. And why would you want to run without it? I see source code for it, updat501.asm.

ghost commented 5 years ago

@the-grue How about unzipping SHIPMOS.ZIP and provide it as a directory? In that case the patch that removes updat501.sys from config.sys would be the instant solution

You're in the wrong place for instant solutions. Throwing patches around like confetti only makes things worse.

the-grue commented 5 years ago

My gmail says the link is suspicious

Be careful with this message. It contains a suspicious link that was used to steal people's personal information

Not sure how a Youtube link can be harmful, but that's what gmail said.

Google owns both gmail and youtube, so maybe it is true.

the-grue commented 5 years ago

@the-grue How about unzipping SHIPMOS.ZIP and provide it as a directory? In that case the patch that removes updat501.sys from config.sys would be the instant solution.

I keep it unzipped in a subdirectory for reference.

I do not suppose we need to update SHIPMOS.ZIP itself, it should always contain the original binaries.

Agreed. These are "point in time" artifacts. Best left alone and used as references.

But there should also be an unzipped form of it, with current builds and updated config.sys.

Along with that one, I keep a reference unzipped directory of the 3 disks in SHIPMOS and an extract of the disk in IMAGES.

Once we package our first distribution, we will include updated configuration files I am sure. If UPDAT501.SYS has outlived its usefulness, we can cast it aside or if it has value, merge the changes into the mainline code.

Just some thoughts.

the-grue commented 5 years ago

Not me - @andrewbird told to try w/o UPDAT501.SYS. Since no one listened, I repeated that.

Missed that. Thanks @andrewbird!

His idea was that UPDAT501.SYS requires the kernel completely unmodified. This sounds more plausible to me than your idea about offsets, because the patch in question tries to not change the offsets (adds new variables into a padding space). So if offsets somehow changed, then it's a bit unexpected.

Actually, the offsets within the file can change depending on what assembly language opcodes are generated. You could make what you think is a simple change and go from a 2 byte opcode to a 3 byte opcode. (or the reverse) A single instance of that could throw off the entire file and any offsets the patch is trying to address.

If you look at some of the .PAT files, you'll see they are patching actual locations in an executable file. Also, one of the last steps of building the kernel does "debug $$$$mos.sys < exe2bin.dat" which does a sort of patch or fixup.
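
A toy Python illustration of both sides of this argument, with every byte string and name invented for the example. It says nothing about how UPDAT501.SYS or the .PAT files actually locate their targets; it only shows that a patch tied to a fixed offset survives a change that reuses padding but not one that makes an earlier instruction longer.

```python
def apply_patch(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Overwrite bytes at a fixed offset, refusing if the old bytes are not there."""
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the image size")
    found = image[offset:offset + len(expected)]
    if found != expected:
        raise ValueError(
            f"offset {offset:#x}: expected {expected.hex()}, found {found.hex()}"
        )
    return image[:offset] + replacement + image[offset + len(expected):]


# xor ax,ax (2 bytes) | two bytes of padding | int 21h
ORIGINAL = bytes.fromhex("31 c0 90 90 cd 21")

# New data placed in the padding: everything after it keeps its offset.
PADDING_REUSED = bytes.fromhex("31 c0 12 34 cd 21")

# xor ax,ax replaced by mov ax,0 (3 bytes): everything after it moves by one.
OPCODE_GREW = bytes.fromhex("b8 00 00 90 90 cd 21")

# A hypothetical patch that expects int 21h at offset 4.
PATCH = (4, bytes.fromhex("cd 21"), bytes.fromhex("cd 20"))
```

`apply_patch(ORIGINAL, *PATCH)` and `apply_patch(PADDING_REUSED, *PATCH)` both succeed, while `apply_patch(OPCODE_GREW, *PATCH)` raises because offset 4 no longer holds the expected bytes. A patcher without that check would silently overwrite the wrong instruction instead.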

the-grue commented 5 years ago

A full code audit may be exhausting. I recommend not wearing yourself out on a job so large, at this early time. We can tackle these questions one at a time, as we encounter them. Fighting them all at once is probably too much.

It needs done. Especially as we start to see more patches being submitted. Modifications to files in one directory but not the other will result in broken everything at some point and the more patches and time that pass will mean more things to unravel.

I'll invest some time in this over the weekend along with starting to scan the documentation. Shouldn't be too bad.

ghost commented 5 years ago

Once we package our first distribution

Is that worthwhile? Are there any likely users, besides us? This is an archaeologist's dig. Digging IS the fun.

If UPDAT501.SYS has outlived its usefulness, we can cast it aside

Somebody worked hard to make those patches, no doubt for good reason. Throwing them out just to say we booted a kernel with our own possibly bogus patches, is treating the symptom and ignoring the disease.

or if it has value, merge the changes into the mainline code.

Right, let's not throw out an important part of the legacy, until we understand it well enough to replace it with something better.

ghost commented 5 years ago

It needs done. Especially as we start to see more patches being submitted

There's already too much patching and not enough analysis and discussion. These things need to be explored deeply before slinging patches around. Some people don't like project noise, they want a library quiet zone where they can study in isolation and nobody bothers them. But that's their problem, they are free to unplug and leave at any time.

the-grue commented 5 years ago

Once we package our first distribution

Is that worthwhile? Are there any likely users, besides us? This is an archaeologist's dig. Digging IS the fun.

I made an open distribution of Desmet C last year once I got all of the code to build at a standard level. I don't know if anyone is using it, nor do I really care, I just did it for the fun of it and to give folks a level set of tools to play with if they wanted. It is DOS based, doesn't support instructions above the 8086 or maybe 80186 in the assembler, is totally a fossil, but it kept me mentally occupied for a while.

If UPDAT501.SYS has outlived its usefulness, we can cast it aside

Somebody worked hard to make those patches, no doubt for good reason. Throwing them out just to say we booted a kernel with our own possibly bogus patches, is treating the symptom and ignoring the disease.

or if it has value, merge the changes into the mainline code.

Right, let's not throw out an important part of the legacy, until we understand it well enough to replace it with something better.

Like I said, these were options. If at any point we don't like the direction the project is taking, we can fork and go our own way. That's the beauty of collaborative open source projects. I would fully support merging in the code, but as I pointed out earlier, there are newer versions in the binaries in the repo. Do we go to the trouble to reverse engineer those patches? They are really small, a few kilobytes compiled. Could be fun. But further down the road.

Right now we need to know what we got and start working off the same code base and leave everything else to the reference stacks.

I'm sure there are folks out there following this drama and just waiting for us to have a breakthrough.

ghost commented 5 years ago

I'm sure there are folks out there following this drama and just waiting for us to have a breakthrough.

Billions and billions, no doubt. When it's done, all we need is a time machine, travel back to 1980, and corner the market before TSL is born.

the-grue commented 5 years ago

I'm sure there are folks out there following this drama and just waiting for us to have a breakthrough.

Billions and billions, no doubt. When it's done, all we need is a time machine, travel back to 1980, and corner the market before TSL is born.

But we would do it differently and not violate the 10 byte reserved section in the directory entry!

stsp commented 5 years ago

Offsets within a file change no matter what you do, true. My point was only that the changes in the public headers were not supposed to change any offsets, so the proposed rebuild was just a shot in the dark. So it seems like no rebuild is needed for anything, and for updat501 it won't help, as it really requires a completely unmodified kernel, as Andrew said. But someone has erased his message! Not sure who and why, but erasing the most meaningful messages is not going to help.

ghost commented 5 years ago

I did not ignore it. I never saw it, never got any email about it.

Today I will insert extra debug instructions in vanilla kernel, rebuild, and see if that crashes updat501.