mach-kernel / cadius

A maintained fork of BrutalDeluxe's Cadius ProDOS disk imaging utility (used for making Apple II disk images).
GNU General Public License v3.0
31 stars 9 forks source link

ADDFILE will create first block as sparse, confusing ProDOS #42

Open inexorabletash opened 3 months ago

inexorabletash commented 3 months ago

Repro:

# First let's create a file with the first block all zeros
dd if=/dev/zero of=SPARSE#060000 bs=1 count=512
echo "Hello World" >> SPARSE#060000

# Now let's create a volume and add the file
cadius CREATEVOLUME tmp.po TEST 800kb
cadius ADDFILE tmp.po /TEST SPARSE#060000
cadius CATALOG tmp.po
# note 1 sparse block
cadius CHECKVOLUME tmp.po
# note no errors

Now mount a ProDOS-8 system disk (2.0.3, 2.4.3, shouldn't matter) and the tmp.po disk and run BASIC.SYSTEM:

DELETE /TEST/SPARSE

Back to the shell:

cadius CHECKVOLUME tmp.po
# => Block 0000 is declared FREE in the Bitmap, but used by Boot

Ooops! For even more fun, go back to BASIC.SYSTEM:

10 HOME
SAVE /TEST/HELLO
# ProDOS crashes with RESTART SYSTEM-$0C

Per ProDOS-8 Technical Reference Manual:

By the Way: The first data block of a standard file, be it a seedling, sapling, or tree file, is always allocated. Thus there is always a data block to be read in when the file is opened.

So it appears Cadius is violating this filesystem assumption, and it makes ProDOS very sad.

inexorabletash commented 3 months ago

Also, CHECKVOLUME should flag this issue, and maybe mention a file's sparse blocks in its verbose mode.

inexorabletash commented 3 months ago

I think this will fix ADDFILE, but doesn't help CHECKVOLUME.

diff --git a/Src/Prodos_Add.c b/Src/Prodos_Add.c
index 483669c..5028640 100644
--- a/Src/Prodos_Add.c
+++ b/Src/Prodos_Add.c
@@ -565,7 +565,9 @@ static void ComputeFileBlockUsage(struct prodos_file *current_file)
   for(i=0; i<current_file->data_length; i+=BLOCK_SIZE)
     {
       /* Recherche les plages de 0 */
-      if(i+BLOCK_SIZE <= current_file->data_length)
+      if(i == 0)
+        result = 1; /* Block 0 ne doit pas être Sparse */
+      else if(i+BLOCK_SIZE <= current_file->data_length)
         result = memcmp(&current_file->data[i],empty_block,BLOCK_SIZE);
       else
         result = memcmp(&current_file->data[i],empty_block,current_file->data_length-i);
@@ -586,7 +588,9 @@ static void ComputeFileBlockUsage(struct prodos_file *current_file)
       for(i=0; i<current_file->resource_length; i+=BLOCK_SIZE)
         {
           /* Recherche les plages de 0 */
-          if(i+BLOCK_SIZE <= current_file->resource_length)
+          if(i == 0)
+            result = 1; /* Block 0 ne doit pas être Sparse */
+          else if(i+BLOCK_SIZE <= current_file->resource_length)
             result = memcmp(&current_file->resource[i],empty_block,BLOCK_SIZE);
           else
             result = memcmp(&current_file->resource[i],empty_block,current_file->resource_length-i);
@@ -883,7 +887,9 @@ static WORD CreateSaplingContent(struct prodos_image *current_image, struct prod
   for(i=0,j=1,k=0; i<data_length; i+=BLOCK_SIZE,k++)
     {
       /* Recherche les plages de 0 */
-      if(i+BLOCK_SIZE <= data_length)
+      if(i == 0)
+        is_empty = 0; /* Block 0 ne doit pas être Sparse */
+      else if(i+BLOCK_SIZE <= data_length)
         is_empty = !memcmp(&data[i],empty_block,BLOCK_SIZE);
       else
         is_empty = !memcmp(&data[i],empty_block,data_length-i);
@@ -1006,7 +1012,9 @@ static WORD CreateTreeContent(struct prodos_image *current_image, struct prodos_
   for(i=0,j=index_data,k=0,l=0; i<data_length; i+=BLOCK_SIZE,k++)
     {
       /* Recherche les plages de 0 */
-      if(i+BLOCK_SIZE <= data_length)
+      if(i == 0)
+        is_empty = 0; /* Block 0 ne doit pas être Sparse */
+      else if(i+BLOCK_SIZE <= data_length)
         is_empty = !memcmp(&data[i],empty_block,BLOCK_SIZE);
       else
         is_empty = !memcmp(&data[i],empty_block,data_length-i);
inexorabletash commented 3 months ago

Here's a hacky warning that can be added. Not scoped to CHECKVOLUME but maybe it's better this way?

diff --git a/Src/Dc_Prodos.c b/Src/Dc_Prodos.c
index c979036..8dd6c29 100644
--- a/Src/Dc_Prodos.c
+++ b/Src/Dc_Prodos.c
@@ -1090,6 +1090,9 @@ static int GetFileDataResourceSize(struct prodos_image *current_image, struct fi
             if(tab_block[i] == 0)
               file_entry->nb_sparse++;

+          if(tab_block[0] == 0)
+            logf_error("  Warning : Block 0 is sparse in file: '%s'.\n", file_entry->file_path);
+
           /* Table des blocs utilisés */
           file_entry->tab_used_block = BuildUsedBlockTable(nb_block,tab_block,nb_index_block,tab_index_block,&file_entry->nb_used_block);
           if(file_entry->tab_used_block == NULL)
@@ -1180,6 +1183,8 @@ static int GetFileDataResourceSize(struct prodos_image *current_image, struct fi
             if(tab_block[i] == 0)
               file_entry->nb_sparse++;

+          // TODO: Warn if data fork block 0 is sparse?
+
           /* Table des blocs utilisés (Data) */
           tab_used_block_data = BuildUsedBlockTable(nb_block,tab_block,nb_index_block,tab_index_block,&nb_used_block_data);
           if(tab_used_block_data == NULL)
@@ -1250,6 +1255,8 @@ static int GetFileDataResourceSize(struct prodos_image *current_image, struct fi
             if(tab_block[i] == 0)
               file_entry->nb_sparse++;

+          // TODO: Warn if resource fork block 0 is sparse?
+
           /* Table des blocs utilisés (Resource) */
           tab_used_block_resource = BuildUsedBlockTable(nb_block,tab_block,nb_index_block,tab_index_block,&nb_used_block_resource);
           if(tab_used_block_resource == NULL)
inexorabletash commented 3 months ago

Note that adding code specifically to CHECKVOLUME would be more intrusive as the table of all a file's blocks (tab_block in the above code) is not persisted outside of the GetFileDataResourceSize() call. The file_descriptive_entry struct used in CHECKVOLUME only holds the used blocks (tab_used_block) so the data is not present. We could persist the whole list but that's a lot of not fun C memory management; we could alternately just add a flag to the struct for just this case. But the simple warning seems okay to me.

inexorabletash commented 3 months ago

One thing I'm not sure about - for GS/OS "extended" files - with a data fork and a resource fork - both forks are managed as separate effectively separate seed/sapling/tree. Is the first block of either of those permitted to be "sparse"? I'm guessing no, but I haven't found anything definitive yet in APDA docs.

mach-kernel commented 3 months ago

Hi @inexorabletash! Thanks very much for the detailed report!! Would you be so kind as to pull request the patches? I think the rationale about the warnings / first block assumption make sense.

One thing I'm not sure about - for GS/OS "extended" files - with a data fork and a resource fork - both forks are managed as separate effectively separate seed/sapling/tree. Is the first block of either of those permitted to be "sparse"? I'm guessing no, but I haven't found anything definitive yet in APDA docs.

I did some sleuthing in the toolbox reference and found this:

image

I am confused about the "empty" semantics (see CreateResourceFile). I read this as: an empty resource fork implies a valid header that points to a valid resource map with no entries. In the case of a sparse block would/should it read the 0s and determine that the header+map are invalid? Would it crash?

My guess is that it isn't inherently prohibited but would probably not be a valid usage in the eyes of the resource manager.

inexorabletash commented 3 months ago

PR added!

I chatted with @fadden on Slack and he said he'd had similar issues in CiderPress (https://github.com/fadden/ciderpress/issues/49) which he fixed, and for "extended" files added some similar checks to CiderPress2 (here, here) so I think at least these tools are aligned with the proposed changes here.