Closed NJRoadfan closed 1 year ago
My immediate guess would be this commit https://github.com/rdmark/Netatalk-2.x/commit/b2e75d94745a503fab357bc09381c96cc4d1eccd
It made checks for valid resource fork data in ad_header_read()
more stringent. Is there any chance your file system has bad data there?
Things got worse doing a clean install of GS/OS (done on the host side with A2SERVER). After logging into the server to netboot, it goes straight to "Cannot find Start.GS.OS". It's possible the fix for the CVE has problems handling the AFP ProDOS extensions, but the errors seem random.
Do the random error go away altogether if you revert the commit?
I think the first step would be to confirm if this is caused by the changes to parse_entries()
, or if it always threw a warning that was ignored in ad_header_read()
Reverted the commit and the errors in the log are no longer being thrown. Files are being loaded properly by the client and GS/OS is now fully booting again. Looks like this patch is still causing some regressions. Granted the code may be fine for use in 3.x, its certainly causing problems in 2.x.
So reading the code, there were two seemingly relevant places where the logic changed: How nentries
is calculated, and how the valid adouble data length is calculated.
Would you be able to run some tests against this branch? It has brought back the previous logic for the former. https://github.com/rdmark/Netatalk-2.x/tree/test-adouble-malformed
Built and tested that branch and no errors in the log or loading files on the client side.
That's great! So for some reason, these GS/OS resource forks are sometimes of sizes that are larger than what we get from memcpy(&nentries, buf + ADEDOFF_NENTRIES, sizeof(nentries));
I believe this adjustment of the resource fork data length for the purpose of validation is what makes it work for you:
if (len + AD_HEADER_LEN > sizeof(ad->ad_data))
len = sizeof(ad->ad_data) - AD_HEADER_LEN;
If I read this correctly, it truncates the length to the size of ad_data
when larger than the buffer, effectively skipping the check for "too large" resource forks. It seems a bit iffy but I can't say with enough certainty if it may mask other valid error states.
I removed one useless line and made a PR here https://github.com/rdmark/Netatalk-2.x/pull/19
Looking in the AppleDouble folders, these Toolset files don't appear to have actual data in the resource fork. They are all uniformly 741 bytes and appear to just have a header plus the filetype/creator data.
Sounds like a type of file that would be highly uncommon in Mac OS, which would explain why we haven't seen many issues there.
Anyhow, I merged the PR to the 2-x branch.
Some background info. Originally ProDOS did not have forked files. They always had "file types" and "auxiliary types" which are roughly analogous to the Macintosh file type/creator data (and afpd provides translation for). GS/OS later added resource forks and mixed case filenames to the ProDOS file system.
That makes sense. It's an interesting patchwork of backwards compatibility going on in IIGS land.
Please let me know when you've verified that everything works with the IIGS / A2SERVER integration points. I'll tag a release at that point.
Tested latest master and working error free. :+1:
You mean the latest branch-netatalk-2-x
right? :)
Yeah, that one. I'd be surprised if 3.x could netboot an Apple IIgs.
Indeed.
I think I'll try to get this merged to upstream 2.2 as well.
@NJRoadfan Would you be able to test that this works in your environment too? https://github.com/Netatalk/Netatalk/pull/182
If it's too much of a pain to test then never mind. :)
PR #182 is working fine on branch-netatalk-2-2. No errors in log or on client net booting GS/OS.
Thanks for testing!
OK, this problem reared its ugly head again.....but I think I found out why. One thing A2SERVER does a lot of is manually resyncing of .AppleDouble folders using a shell script called 'afpsync'.
Shell script in question: https://github.com/NJRoadfan/a2server/blob/njroadfan-main-dev/scripts/tools/afpsync.txt
This re-associates or creates empty forked file data that was copied either under Linux or from a Windows machine using SMB. Something about the AppleDouble data this script creates is not agreeing with the CVE patches.
That's very curious! Are you aware of what kinds of operations that A2SERVER does on the shared file system that requires this adouble data refresh? For what purposes does it move around the files?
The meat of this logic seems to be sitting in a different script, mkvolinfo
.
FWIW, I'm pretty sure Netatalk creates empty adouble metadata and volinfo for any file that is copied onto the shared file system in Linux. In fact, this is a very common usecase for the PiSCSI integration layer, where we download files from the web right onto the shared volume and let Netatalk take care of the rest. We had some bugs early on where afpd crashed when creating the adouble data for new files in certain circumstances, but we fixed all the bugs that we encountered.
Maybe, just maybe, the afpsync script is not needed anymore? It might be worth testing at least.
What's weird is reverting back the CVE patch and running netatalk seems to fix the corruption in the cnid database. This is why the problem wasn't showing up after installing patched versions of netatalk until AFTER I started moving files around.
The CVE patch could be introducing a regression. But it might as well be catching invalid adouble data generated by that script.
afpsync is running the dbd utility (with the -r option) that is included with netatalk to rebuild the database. Running 'dbd -r' manually, it appears that GS/OS is causing CNID database errors to occur on any file GS/OS happens to read.
user1@a2server:~/Netatalk-2.x$ sudo dbd -r /srv/A2SERVER/A2FILES/
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/PRINTER.SETUP'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/PRINTER.SETUP' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/ATINIT'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/ATINIT' to AppleDouble file
...etc...
I'm guessing this is what is causing problems with the CVE patch. This may also be a GS/OS 6.0.4 bug.
Adding to this.... it appears that GS/OS does some light corruption of adouble data to system files only when netbooted, but that is not the problem.
It seems that the CVE patch caused a regression in the dbd utility. When running dbd-r to force a cnid database rebuild, the stricter adouble checking routines are preventing it from repairing the database. This "corruption" is actually being caused by the 'cppo' script in A2SERVER. This tool manually copies extracted resource fork and filetype/creator data into AppleDouble folders and the A2SERVER scripts use dbd to update the cnid database. It's a hack, but it worked fine before the CVE fixes.
Good research. We can try to modify dbd
to work with the CVE patch. The code lives in etc/cnid_dbd
. Might be a tall order though. There's a lot of code. :)
Specifically: The dbd
tool's code is in the three cmd_dbd* files. Looking at it though, this CVE patch only touches parse_entries() and ad_header_read() which both are internal to the adouble component, so it's weird that these few changed lines impact dbd
...
Are you still just talking about this one CVE patch from above, or have you reverted the other handful that I merged around the same time?
Commit https://github.com/rdmark/Netatalk-2.x/commit/b2e75d94745a503fab357bc09381c96cc4d1eccd is the one breaking everything. Revert that and dbd functions as normal.
This commit might fix the problem: https://github.com/Netatalk/Netatalk/commit/9d0c21298363e8174cdfca657e66c4d10819507b
Assuming the above fixes the problem, commit https://github.com/rdmark/Netatalk-2.x/commit/6a06656a11d6cd8d3ee33f7e18913119ed9fc08f may not even be needed.
I'll try and clobber something together to test that when I have time. Under ordinary use cases, this problem will not rear it's ugly head with Netatalk. A2SERVER's tools are kind of unique because Ivan needed a way to transfer forked files from native Apple II disk images and compressed archives to Netatalk shared folders while retaining the forked file metadata in a Linux environment.
PROBLEM SOLVED.
You can safely revert https://github.com/rdmark/Netatalk-2.x/commit/6a06656a11d6cd8d3ee33f7e18913119ed9fc08f, and leave the stricter resource fork checking in place.
The problem was with the cppo utility that came with A2SERVER. It was generating AppleDouble headers with 13 metadata entries when in reality the file was only being created with 9 entries. I think the confusion when the utility was written was that any AppleDouble file generated automatically by netatalk always has 13 metadata entries and the author assumed that was the correct value.
With the enhanced checking being done by the CVE patch, this is properly flagged as a malformed file and netatalk ceases reading the file.
That is some might impressive troubleshooting work! How did you reach the conclusion that this was the problem?
I've reverted the fix in this fork now. Let's keep an eye on it, and ask for it to be reverted upstream at a later time.
I sat down and read the AppleDouble file specification. I would have spotted the problem earlier if the error messages in the log were a bit more descriptive.
Jan 4 19:50:47 a2server afpd: ad_refresh: nentries 4 eid 0
What this actually means is that with 4 nentries remaining, the AppleDouble file parser encountered a metadata entry with ID type 0, which is invalid. The cppo tool was writing the value of '0x0D' (13) for the number of metadata entries in the header, but was only writing 9 actual entries. When libatalk read the file prior to the CVE patches, it silently failed and the bug went unnoticed. Since the A2SERVER scripts scans and rebuilds the cnid database and AppleDouble files after each cppo operation, these invalid files were quickly fixed.
This is some excellent sharpshooting!
Ran across this one trying to netboot GS/OS. Testing indicates that 220401b and newer are effected by this bug. Bug is not present in netatalk-2.2.6. Logs follow:
Client throws read errors for above files and halts booting GS/OS. Will continue to test older releases to try and isolate what commit introduced this problem.