nasa / cFE

The Core Flight System (cFS) Core Flight Executive (cFE)
Apache License 2.0
413 stars 202 forks source link

Possibly outdated OS_initfs/OS_mkfs logic for creating /ram #589

Open thewonderidiot opened 4 years ago

thewonderidiot commented 4 years ago

Describe the bug cFE 6.7.0+ still retain the original logic for setting up /ram. For poweron resets it works fine, however I think the logic may now be outdated for processor resets. On a processor reset, the flow is:

  1. Attempt to initialize the filesystem with OS_initfs().
  2. If that fails, format the filesystem with OS_mkfs().

This logic may have made sense for older OSALs, but I don't see how it is helpful with OSAL 5.0+. In the new OSAL, both OS_initfs() and OS_mkfs() call OS_FileSys_Initialize(), with only the final argument should_format differing. Both call OS_FileSysStartVolume_Impl() to initialize the ram disk. For OS_initfs(), this is the only Impl function called. In the OS_mkfs() case, should_format is only checked if OS_FileSysStartVolume_Impl() succeeded, and only if it did, then OS_FileSysFormatVolume_Impl() is called.

Because the two OSAL functions follow exactly the same path up until the check for should_format, it seems to me that if OS_initfs() fails, then OS_mkfs() cannot possibly succeed. In other worse, if OS_initfs() fails, then the cFE will inevitably panic instead of actually attempting to reformat /ram.

To Reproduce n/a

Expected behavior I think the intended logic in the cFE start up was to attempt to initialize an existing /ram filesystem on a processor reset, but reformat it if that failed and continue to boot. With the current OSAL, the only way I can see for that to work would be something like this:

  1. Initialize the filesystem with OS_initfs(). If that fails, panic.
  2. Attempt to mount the filesystem with OS_mount().
  3. If that fails, use OS_rmfs() to remove it, and then call OS_mkfs().
  4. Attempt again to mount with OS_mount(). If that fails, panic.

The other possible change for this to work as I think it is intended would be to make OS_initfs() fail if given an invalid filesystem. But as written for that to happen, OS_FileSysStartVolume_Impl() would need to fail, which would also make OS_mkfs() always fail.

Code snips cFE RAM disk creation on processor reset:

      RetStatus = OS_initfs((void *)RamDiskMemoryAddress, "/ramdev0", "RAM", CFE_PLATFORM_ES_RAM_DISK_SECTOR_SIZE, CFE_PLATFORM_ES_RAM_DISK_NUM_SECTORS );
      if ( RetStatus != OS_FS_SUCCESS )
      {
         CFE_ES_WriteToSysLog("ES Startup: Error Initializing Volatile(RAM) Volume. EC = 0x%08X\n",(unsigned int)RetStatus);
         CFE_ES_WriteToSysLog("ES Startup: Formatting Volatile(RAM) Volume.\n");

         RetStatus = OS_mkfs((void *)RamDiskMemoryAddress, "/ramdev0", "RAM", CFE_PLATFORM_ES_RAM_DISK_SECTOR_SIZE, CFE_PLATFORM_ES_RAM_DISK_NUM_SECTORS );
         if ( RetStatus != OS_SUCCESS )
         {
            CFE_ES_WriteToSysLog("ES Startup: Error Creating Volatile(RAM) Volume. EC = 0x%08X\n",(unsigned int)RetStatus);

OS_FileSys_Initialize() logic:

        return_code = OS_FileSysStartVolume_Impl(local_id);

        if (return_code == OS_SUCCESS)
        {
            /*
             * The "mkfs" call also formats the device.
             * this is the primary difference between mkfs and initfs.
             */
            if (should_format)
            {
                return_code = OS_FileSysFormatVolume_Impl(local_id);
            }

System observed on:

Additional context It's very possible that I'm misunderstanding something here, in which case I apologize in advance for the noise!

Reporter Info Mike Stewart, Capella Space Corporation

jphickey commented 4 years ago

I do agree there is a disconnect here. Even in OSAL 4.2.1a the OS_mkfs and OS_initfs differed only in whether they formatted the newly-allocated ramdisk. I concur that it if OS_initfs fails then OS_mkfs will surely fail too, because it does the same exact steps plus an extra one (format).

All that being said, the approach I would take is to deprecate the OS_initfs/OS_mkfs calls. They don't really make sense for cross-platform applicability. The PSP should handle setting up the filesystems before even starting CFE and it can more easily do this job in a platform-specific manner.

My preference would be to remove this logic from ES and have the PSP set up all filesystems prior to the system starting. ES would just use the file systems however the PSP set them up.

@acudmore might have other thoughts on this?

skliper commented 3 years ago

@acudmore @jphickey any updates on this in the context of Caelum? If not are we OK deferring it?

jphickey commented 3 years ago

I think it is sort of broken but it has been broken for quite some time, even before CFE 6.5/ OSAL 4.2. To fix it probably requires some finer-grained OSAL APIs - because mkfs/initfs do too much. (And fixing that part should be done with some new ID-based APIs in nasa/osal#719)

So while not exactly correct - AFAICT it hasn't been an issue that has actually interfered with operations at all. If OS_initfs() fails after processor reset the ultimate path is to reboot again in a poweron reset mode. It's not ideal and could be cleaned up for sure, but given the Caelum timeframe maybe we continue living with it for now.