nasa / CF

The Core Flight System (cFS) CFDP application.
Apache License 2.0

Directory polling does not clean up open directory file descriptors #321

Open dennisafa opened 2 years ago

dennisafa commented 2 years ago


Describe the bug
CF_CFDP_PlaybackDir is called by CF_CFDP_ProcessingPollingDirectories when the interval for the CF_Poll_t structure has expired, and it attempts to reopen the directory. If the polling directory does not continuously receive new files to process, OS_DirectoryOpen is called without a corresponding OS_DirectoryClose. This eventually exhausts the available file descriptors and produces this error:

CF: failed to open playback directory /cf/dl, error=-14

To Reproduce
Steps to reproduce the behavior:

  1. Launch CF with a configured polling directory
  2. Let the polling interval expire so that CF attempts to reopen the directory
  3. Wait until the file descriptors are exhausted and the error appears
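
The failure mode in these steps can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: the handle pool, its size (`SIM_MAX_DIRS`), the error value, and every `sim_*` name are hypothetical stand-ins, not OSAL or CF code. It shows that opening once per polling cycle without closing fails after exactly as many cycles as there are handles, while pairing each open with a close never fails.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_DIRS        4    /* hypothetical pool size; the real limit comes from the OSAL configuration */
#define SIM_ERR_NO_FREE_IDS (-1) /* hypothetical error value, not the OSAL code */

static bool sim_in_use[SIM_MAX_DIRS];

/* Hypothetical stand-in for a directory open: grab the first free handle. */
static int sim_dir_open(void)
{
    for (int i = 0; i < SIM_MAX_DIRS; ++i)
    {
        if (!sim_in_use[i])
        {
            sim_in_use[i] = true;
            return i;
        }
    }
    return SIM_ERR_NO_FREE_IDS;
}

/* Hypothetical stand-in for a directory close: release the handle. */
static void sim_dir_close(int id)
{
    assert(id >= 0 && id < SIM_MAX_DIRS);
    sim_in_use[id] = false;
}

/* Run `cycles` polling cycles from an empty pool.
 * Returns the number of successful opens before the first failure,
 * or `cycles` if no open ever fails. */
static int sim_poll(int cycles, bool close_each_cycle)
{
    for (int i = 0; i < SIM_MAX_DIRS; ++i)
    {
        sim_in_use[i] = false;
    }

    for (int n = 0; n < cycles; ++n)
    {
        int id = sim_dir_open();
        if (id < 0)
        {
            return n; /* pool exhausted: this is the leak showing up */
        }
        if (close_each_cycle)
        {
            sim_dir_close(id);
        }
    }
    return cycles;
}
```

Calling `sim_poll(100, false)` models the reported behavior and `sim_poll(100, true)` models the expected behavior.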

Expected behavior
I expect the directory file descriptor to be cleaned up before each OS_DirectoryOpen call.

Code snips
Configuration table used:

CF_ConfigTable_t CF_config_table = {
    10,       /* ticks_per_second */
    33554432, /* max number of bytes per wakeup to calculate r2 recv file crc */
    24,       /* temp local id */
    {{
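         32,  /* max number of outgoing messages per wakeup */
         512, /* max number of rx messages per wakeup */
         5,   /* ack timer */
         5,   /* nak timer */
         30,  /* inactivity timer */
         8,   /* ack limit */
         8,   /* nak limit */
         CF_PDU_CMD_MID, /* input message ID */
         CF_PDU_TLM_MID, /* output message ID */
         512,            /* input pipe depth */
         /* polling dir 0: interval (s), priority, CFDP class, dest entity ID, src dir, dst dir, enabled */
         {{1, 0, CF_CFDP_CLASS_1, 21, "/cf/dl", ".", 1}, {0}, {0}, {0}, {0}},
         "", /* throttle sem for channel 1, empty string means no throttle */
         1,  /* dequeue enabled */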
     }},
    1253, /* outgoing_file_chunk_size */
    "/ram",
};

System observed on:

Additional context
I added a check in CF_CFDP_PlaybackDir that tests whether the directory FD is 0. If it is nonzero, I close the directory before reopening it. This gets rid of the problem.
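
The workaround described here might look roughly like the following sketch. It is not the reporter's patch: the `wk_*` names, the playback structure, and the two-entry handle pool are all hypothetical, and the convention that 0 means "no directory open" is taken from the description above rather than from the CF source.

```c
#include <assert.h>
#include <stdbool.h>

#define WK_MAX_DIRS 2 /* hypothetical pool size, kept small so a leak would show quickly */

/* Hypothetical playback state; dir_id == 0 means "no directory open". */
typedef struct
{
    int dir_id;
} wk_playback_t;

static bool wk_in_use[WK_MAX_DIRS + 1]; /* valid ids are 1..WK_MAX_DIRS; 0 is reserved */

/* Hypothetical open: returns 0 on success and stores a nonzero id, -1 if the pool is full. */
static int wk_dir_open(int *id_out)
{
    for (int i = 1; i <= WK_MAX_DIRS; ++i)
    {
        if (!wk_in_use[i])
        {
            wk_in_use[i] = true;
            *id_out      = i;
            return 0;
        }
    }
    return -1;
}

/* Hypothetical close: releases a previously opened id. */
static void wk_dir_close(int id)
{
    assert(id >= 1 && id <= WK_MAX_DIRS);
    wk_in_use[id] = false;
}

/* The workaround: if a directory is still open from a previous cycle,
 * close it before opening again. */
static int wk_playback_dir(wk_playback_t *pb)
{
    if (pb->dir_id != 0)
    {
        wk_dir_close(pb->dir_id);
        pb->dir_id = 0;
    }
    return wk_dir_open(&pb->dir_id);
}

/* Drive `cycles` polling cycles and count how many opens fail. */
static int wk_run(int cycles)
{
    wk_playback_t pb       = {0};
    int           failures = 0;

    for (int n = 0; n < cycles; ++n)
    {
        if (wk_playback_dir(&pb) != 0)
        {
            ++failures;
        }
    }
    if (pb.dir_id != 0)
    {
        wk_dir_close(pb.dir_id); /* leave the pool clean for the next run */
    }
    return failures;
}
```

With the guard in place, `wk_run` reports no failed opens no matter how many cycles it runs, even though the pool holds only two handles. A guard like this hides the symptom; it does not explain why a handle was left open in the first place.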

Reporter Info
Dennis Afanasev, NASA Goddard Code 587

skliper commented 2 years ago

I'm not quite following. The initiate should only happen if the polling directory isn't busy: https://github.com/nasa/CF/blob/80538827ad62f4a12c91259c98001cfbadd47a87/fsw/src/cf_cfdp.c#L1584-L1596

And busy is only set to zero when the directory has been closed and there's no remaining transactions: https://github.com/nasa/CF/blob/80538827ad62f4a12c91259c98001cfbadd47a87/fsw/src/cf_cfdp.c#L1499-L1511

Can you elaborate on the flow that causes multiple directory opens without a close?
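
The gating described in this comment can be modeled as below. This is a simplified sketch of the described behavior, not the code at the linked lines: the `gate_*` names and fields are hypothetical, and the model assumes an empty directory is read to its end (and closed) in a single tick. Under those assumptions the number of outstanding opens never exceeds one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one polling directory and its playback state. */
typedef struct
{
    bool busy;            /* set on initiate; cleared once the directory is closed and no transactions remain */
    bool dir_open;        /* true while the directory handle is held */
    int  num_ts;          /* transactions still in flight */
    int  opens;           /* total directory opens */
    int  closes;          /* total directory closes */
    int  max_outstanding; /* largest observed value of opens - closes */
} gate_poll_t;

/* One wakeup of the polling logic. */
static void gate_tick(gate_poll_t *p, bool interval_expired)
{
    if (!p->busy)
    {
        if (interval_expired)
        {
            /* Initiate playback only when not busy. */
            assert(!p->dir_open);
            p->dir_open = true;
            p->busy     = true;
            p->opens++;
            if (p->opens - p->closes > p->max_outstanding)
            {
                p->max_outstanding = p->opens - p->closes;
            }
        }
        return;
    }

    /* Busy: model an empty directory that is read to the end and closed in one tick. */
    if (p->dir_open)
    {
        p->dir_open = false;
        p->closes++;
    }

    /* Busy is released only after the close, and only with no transactions left. */
    if (!p->dir_open && p->num_ts == 0)
    {
        p->busy = false;
    }
}

/* Run `ticks` wakeups with the interval expiring on every tick. */
static gate_poll_t gate_run(int ticks)
{
    gate_poll_t p = {0};
    for (int n = 0; n < ticks; ++n)
    {
        gate_tick(&p, true);
    }
    return p;
}
```

In this model every open is followed by a close before the next open can happen, so a leak would require a path that clears `busy` while the directory is still open, or one that opens the directory without checking `busy`.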

skliper commented 2 years ago

Tried to recreate using the main branch and I see a successful close for every open. Any chance there's something else using up the FDs?