rdmark / netatalk-2.x

Netatalk is an Open-Source AFP file server for Linux, NetBSD, and Solaris, capable of serving AppleShare over AppleTalk and TCP/IP with multiple Macintosh or Apple II clients.
Other
52 stars 2 forks source link

Random "Bad resource fork" and "malformed AppleDouble" errors #18

Closed NJRoadfan closed 1 year ago

NJRoadfan commented 1 year ago

Ran across this one trying to netboot GS/OS. Testing indicates that 220401b and newer are effected by this bug. Bug is not present in netatalk-2.2.6. Logs follow:

Jan  4 19:50:07 a2server afpd: randnum/rand2num login: user1
Jan  4 19:50:07 a2server afpd: AFPVersion 2.0 Login by user1
Jan  4 19:50:07 a2server afpd: Volume /srv/A2SERVER/A2FILES use CNID scheme dbd.
Jan  4 19:50:07 a2server afpd: CNID server: localhost:4700
Jan  4 19:50:47 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:50:47 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:50:47 a2server afpd: getfilparams(Tool054): bad resource fork
Jan  4 19:50:47 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:50:47 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:50:47 a2server afpd: afp_openfork('Hƒì#010¾/Tool054'): ad_open: errno: 5 (Input/output error)
Jan  4 19:50:48 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:50:48 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:50:48 a2server afpd: getfilparams(Tool020): bad resource fork
Jan  4 19:50:48 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:50:48 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:50:48 a2server afpd: afp_openfork('Hƒì#010¾/Tool020'): ad_open: errno: 5 (Input/output error)
Jan  4 19:51:44 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:51:44 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:51:44 a2server afpd: getfilparams(Tool018): bad resource fork
Jan  4 19:51:44 a2server afpd: ad_refresh: nentries 4  eid 0
Jan  4 19:51:44 a2server afpd: ad_header_read(): malformed AppleDouble
Jan  4 19:51:44 a2server afpd: afp_openfork('Hƒì#010¾/Tool018'): ad_open: errno: 5 (Input/output error)

Client throws read errors for above files and halts booting GS/OS. Will continue to test older releases to try and isolate what commit introduced this problem.

rdmark commented 1 year ago

My immediate guess would be this commit https://github.com/rdmark/Netatalk-2.x/commit/b2e75d94745a503fab357bc09381c96cc4d1eccd

It made checks for valid resource fork data in ad_header_read() more stringent. Is there any chance your file system has bad data there?

NJRoadfan commented 1 year ago

Things got worse doing a clean install of GS/OS (done on the host side with A2SERVER). After logging into the server to netboot, it goes straight to "Cannot find Start.GS.OS". It's possible the fix for the CVE has problems handling the AFP ProDOS extensions, but the errors seem random.

rdmark commented 1 year ago

Do the random error go away altogether if you revert the commit?

I think the first step would be to confirm if this is caused by the changes to parse_entries(), or if it always threw a warning that was ignored in ad_header_read()

NJRoadfan commented 1 year ago

Reverted the commit and the errors in the log are no longer being thrown. Files are being loaded properly by the client and GS/OS is now fully booting again. Looks like this patch is still causing some regressions. Granted the code may be fine for use in 3.x, its certainly causing problems in 2.x.

rdmark commented 1 year ago

So reading the code, there were two seemingly relevant places where the logic changed: How nentries is calculated, and how the valid adouble data length is calculated.

Would you be able to run some tests against this branch? It has brought back the previous logic for the former. https://github.com/rdmark/Netatalk-2.x/tree/test-adouble-malformed

NJRoadfan commented 1 year ago

Built and tested that branch and no errors in the log or loading files on the client side.

rdmark commented 1 year ago

That's great! So for some reason, these GS/OS resource forks are sometimes of sizes that are larger than what we get from memcpy(&nentries, buf + ADEDOFF_NENTRIES, sizeof(nentries));

I believe this adjustment of the resource fork data length for the purpose of validation is what makes it work for you:

    if (len + AD_HEADER_LEN > sizeof(ad->ad_data))
        len = sizeof(ad->ad_data) - AD_HEADER_LEN;

If I read this correctly, it truncates the length to the size of ad_data when larger than the buffer, effectively skipping the check for "too large" resource forks. It seems a bit iffy but I can't say with enough certainty if it may mask other valid error states.

rdmark commented 1 year ago

I removed one useless line and made a PR here https://github.com/rdmark/Netatalk-2.x/pull/19

NJRoadfan commented 1 year ago

Looking in the AppleDouble folders, these Toolset files don't appear to have actual data in the resource fork. They are all uniformly 741 bytes and appear to just have a header plus the filetype/creator data.

rdmark commented 1 year ago

Sounds like a type of file that would be highly uncommon in Mac OS, which would explain why we haven't seen many issues there.

Anyhow, I merged the PR to the 2-x branch.

NJRoadfan commented 1 year ago

Some background info. Originally ProDOS did not have forked files. They always had "file types" and "auxiliary types" which are roughly analogous to the Macintosh file type/creator data (and afpd provides translation for). GS/OS later added resource forks and mixed case filenames to the ProDOS file system.

rdmark commented 1 year ago

That makes sense. It's an interesting patchwork of backwards compatibility going on in IIGS land.

Please let me know when you've verified that everything works with the IIGS / A2SERVER integration points. I'll tag a release at that point.

NJRoadfan commented 1 year ago

Tested latest master and working error free. :+1:

rdmark commented 1 year ago

You mean the latest branch-netatalk-2-x right? :)

NJRoadfan commented 1 year ago

Yeah, that one. I'd be surprised if 3.x could netboot an Apple IIgs.

rdmark commented 1 year ago

Indeed.

I think I'll try to get this merged to upstream 2.2 as well.

rdmark commented 1 year ago

@NJRoadfan Would you be able to test that this works in your environment too? https://github.com/Netatalk/Netatalk/pull/182

If it's too much of a pain to test then never mind. :)

NJRoadfan commented 1 year ago

PR #182 is working fine on branch-netatalk-2-2. No errors in log or on client net booting GS/OS.

rdmark commented 1 year ago

Thanks for testing!

NJRoadfan commented 1 year ago

OK, this problem reared its ugly head again.....but I think I found out why. One thing A2SERVER does a lot of is manually resyncing of .AppleDouble folders using a shell script called 'afpsync'.

Shell script in question: https://github.com/NJRoadfan/a2server/blob/njroadfan-main-dev/scripts/tools/afpsync.txt

This re-associates or creates empty forked file data that was copied either under Linux or from a Windows machine using SMB. Something about the AppleDouble data this script creates is not agreeing with the CVE patches.

rdmark commented 1 year ago

That's very curious! Are you aware of what kinds of operations that A2SERVER does on the shared file system that requires this adouble data refresh? For what purposes does it move around the files?

The meat of this logic seems to be sitting in a different script, mkvolinfo.

FWIW, I'm pretty sure Netatalk creates empty adouble metadata and volinfo for any file that is copied onto the shared file system in Linux. In fact, this is a very common usecase for the PiSCSI integration layer, where we download files from the web right onto the shared volume and let Netatalk take care of the rest. We had some bugs early on where afpd crashed when creating the adouble data for new files in certain circumstances, but we fixed all the bugs that we encountered.

Maybe, just maybe, the afpsync script is not needed anymore? It might be worth testing at least.

NJRoadfan commented 1 year ago

What's weird is reverting back the CVE patch and running netatalk seems to fix the corruption in the cnid database. This is why the problem wasn't showing up after installing patched versions of netatalk until AFTER I started moving files around.

rdmark commented 1 year ago

The CVE patch could be introducing a regression. But it might as well be catching invalid adouble data generated by that script.

NJRoadfan commented 1 year ago

afpsync is running the dbd utility (with the -r option) that is included with netatalk to rebuild the database. Running 'dbd -r' manually, it appears that GS/OS is causing CNID database errors to occur on any file GS/OS happens to read.

user1@a2server:~/Netatalk-2.x$ sudo dbd -r /srv/A2SERVER/A2FILES/
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/PRINTER.SETUP'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/PRINTER.SETUP' to AppleDouble file
Bad CNID in adouble file of '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/ATINIT'
Writing CNID data for '/srv/A2SERVER/A2FILES/USERS/USER1/SETUP/ATINIT' to AppleDouble file
...etc...

I'm guessing this is what is causing problems with the CVE patch. This may also be a GS/OS 6.0.4 bug.

NJRoadfan commented 1 year ago

Adding to this.... it appears that GS/OS does some light corruption of adouble data to system files only when netbooted, but that is not the problem.

It seems that the CVE patch caused a regression in the dbd utility. When running dbd-r to force a cnid database rebuild, the stricter adouble checking routines are preventing it from repairing the database. This "corruption" is actually being caused by the 'cppo' script in A2SERVER. This tool manually copies extracted resource fork and filetype/creator data into AppleDouble folders and the A2SERVER scripts use dbd to update the cnid database. It's a hack, but it worked fine before the CVE fixes.

rdmark commented 1 year ago

Good research. We can try to modify dbd to work with the CVE patch. The code lives in etc/cnid_dbd. Might be a tall order though. There's a lot of code. :)

Specifically: The dbd tool's code is in the three cmd_dbd* files. Looking at it though, this CVE patch only touches parse_entries() and ad_header_read() which both are internal to the adouble component, so it's weird that these few changed lines impact dbd...

Are you still just talking about this one CVE patch from above, or have you reverted the other handful that I merged around the same time?

NJRoadfan commented 1 year ago

Commit https://github.com/rdmark/Netatalk-2.x/commit/b2e75d94745a503fab357bc09381c96cc4d1eccd is the one breaking everything. Revert that and dbd functions as normal.

This commit might fix the problem: https://github.com/Netatalk/Netatalk/commit/9d0c21298363e8174cdfca657e66c4d10819507b

Assuming the above fixes the problem, commit https://github.com/rdmark/Netatalk-2.x/commit/6a06656a11d6cd8d3ee33f7e18913119ed9fc08f may not even be needed.

I'll try and clobber something together to test that when I have time. Under ordinary use cases, this problem will not rear it's ugly head with Netatalk. A2SERVER's tools are kind of unique because Ivan needed a way to transfer forked files from native Apple II disk images and compressed archives to Netatalk shared folders while retaining the forked file metadata in a Linux environment.

NJRoadfan commented 1 year ago

PROBLEM SOLVED.

You can safely revert https://github.com/rdmark/Netatalk-2.x/commit/6a06656a11d6cd8d3ee33f7e18913119ed9fc08f, and leave the stricter resource fork checking in place.

The problem was with the cppo utility that came with A2SERVER. It was generating AppleDouble headers with 13 metadata entries when in reality the file was only being created with 9 entries. I think the confusion when the utility was written was that any AppleDouble file generated automatically by netatalk always has 13 metadata entries and the author assumed that was the correct value.

With the enhanced checking being done by the CVE patch, this is properly flagged as a malformed file and netatalk ceases reading the file.

rdmark commented 1 year ago

That is some might impressive troubleshooting work! How did you reach the conclusion that this was the problem?

I've reverted the fix in this fork now. Let's keep an eye on it, and ask for it to be reverted upstream at a later time.

NJRoadfan commented 1 year ago

I sat down and read the AppleDouble file specification. I would have spotted the problem earlier if the error messages in the log were a bit more descriptive.

Jan 4 19:50:47 a2server afpd: ad_refresh: nentries 4 eid 0

What this actually means is that with 4 nentries remaining, the AppleDouble file parser encountered a metadata entry with ID type 0, which is invalid. The cppo tool was writing the value of '0x0D' (13) for the number of metadata entries in the header, but was only writing 9 actual entries. When libatalk read the file prior to the CVE patches, it silently failed and the bug went unnoticed. Since the A2SERVER scripts scans and rebuilds the cnid database and AppleDouble files after each cppo operation, these invalid files were quickly fixed.

IvanExpert commented 1 year ago

This is some excellent sharpshooting!