nasa / fprime

F´ - A flight software and embedded systems framework
https://fprime.jpl.nasa.gov
Apache License 2.0
10.09k stars 1.32k forks source link

Use static allocation in OSAL #2298

Closed sobkulir closed 6 months ago

sobkulir commented 1 year ago
F´ Version 3.3.2
Affected Component OSAL

Feature Description

Right now, resource-owning OSAL APIs typically have a POINTER_CAST m_handle class member to hold a handle to the underlying OS resource. The resource is typically expected to be dynamically allocated in an initialization method, or in the case of Mutex in the constructor.

Therefore, I'd suggest: 1) Enforce static allocation across OSAL and document places where it's not possible. 2) Do not expect the Mutex to allocate from the heap in the constructor (!!). This maybe should be filed as a bug.

Rationale

Well, dynamic allocation in safety-critical software is a topic in itself, but from what I could understand F Prime tries to enforce doing dynamic allocation only in the initialization phase. However, some of the OSAL types potentially "silently" allocate from the heap during the operational phase, e.g., when reading a directory using Directory::open(), the glibc mallocs memory.

I think such a bug would not be introduced if the design enforced using only statically allocated memory in all OSAL, with exceptions where they can't be avoided (maybe create_pthread?). However, then it should be documented that the given function must only be used during the initialization phase.

This may not be of big concern when a lot of RAM is available, but for deployments with a few hundred kilobytes of RAM this issue becomes relevant. It became relevant for me when implementing a port for Zephyr which often enforces static initialization throughout its APIs.

Suggested implementation

I thought of templating out the handle member, e.g.:

template <typename T>
class Mutex {
    public:
        void lock(); //!<  lock the mutex
        void unLock(); //!<  unlock the mutex

    private:
        T m_handle; //!<  Stored handle to mutex
};

If templates are to be avoided, another option would be using macros, as in the case of TaskIdRepr:

#if defined(TGT_OS_TYPE_VXWORKS) || (FW_BAREMETAL_SCHEDULER == 1)
    typedef int TaskIdRepr;
#elif defined(TGT_OS_TYPE_LINUX) || defined(TGT_OS_TYPE_DARWIN)
    typedef pthread_t TaskIdRepr;
#endif

class TaskId {
  public:
      TaskId();
      ~TaskId();
      bool operator==(const TaskId& T) const;
      bool operator!=(const TaskId& T) const;
      TaskIdRepr getRepr() const;
  private:
      TaskIdRepr id;
};

More details

The affected APIs are: Mutex, Queue, File, Directory, Task, InterruptLock, and WatchdogTimer. In case of the Queue, its size would need to either be determined at compile time or the API would need to allow passing an external buffer.

timcanham commented 1 year ago

@sobkulir To keep the OSAL layer as portable as possible, we made the choice to pass handles this way since some platforms we are targeting (like Linux) only allow this method. The current OSAL layer does not preclude static allocation behind the interface if the platform or homegrown OSAL implementation supports it. For instance, one of our internal projects is implementing a basic file system where the file name is used to look up an entry in a static table, so there is no dynamic memory allocation but the API is preserved. We are avoiding the messiness of templates as much as possible, and macros are problematic since you have to keep adding entries for each new platform that gets ported.

sobkulir commented 1 year ago

@timcanham Thank you for your response!

The current OSAL layer does not preclude static allocation behind the interface if the platform or homegrown OSAL implementation supports it.

Maybe I'm missing something, but I think this is not correct. For example, since you mentioned file system, here's our implementation of File::open() in Zephyr. It's quite hacky because we used the NATIVE_INT_TYPE m_fd meant for file descriptor to hold a pointer.

File::Status File::open(const char* fileName, File::Mode mode, bool include_excl) {
    struct fs_file_t *file = reinterpret_cast<struct fs_file_t *>(k_malloc(sizeof(struct fs_file_t)));

    if (file == nullptr) {
        FW_ASSERT(0);
        return NO_SPACE;
    }

    fs_mode_t flags = modeToZephyrFlags(mode);

    fs_file_t_init(file);
    int rc;

    rc = fs_open(file, fileName, flags);
    if (rc < 0) {
        DEBUG_PRINT("Error opening file %s: %d\n", fileName, rc);
        this->m_lastError = rc;
        k_free(file);
        return NOT_OPENED;
    }

    // Dangerous!
    this->m_fd = reinterpret_cast<NATIVE_INT_TYPE>(file);
    this->m_mode = mode;
    return OP_OK;
}

The full implementation can be found here: sobkulir/fprime-zephyr-app/fprime-zephyr/Os/File.cpp. I'd be happy if you have an idea on how to avoid the malloc.

I totally see your point on avoiding the templates and macros as much as possible. At the same time, I would like to find a compromise because on some OSes this clashes with the rule of Avoid dynamic allocation after initialization, as shown above, or the Directory::open() example from the first comment.

timcanham commented 1 year ago

@sobkulir I agree that in this example, you are doing dynamic allocation in Zephyr, since that is the API Zephyr provides. As a thought exercise, if you weren't using F Prime and your project had a requirement to dynamically open files (say to dump an instrument's data) by name, how would you avoid using this API?

sobkulir commented 1 year ago

@timcanham I would use this API, but allocate the struct fs_file_t handle on stack or globally. I'm not 100% sure, and this is what we still need to check internally, but the subsequent fs_open() doesn't use any dynamic memory and there's a compilation-time parameter in Zephyr on how many files can be opened at the same time. The kernel preallocates the I/O buffers statically. Please let me know if I'm missing something, I'm kind of curious where this will go ngl haha.

timcanham commented 1 year ago

@sobkulir So your objection is not so much the open call, but the k_malloc call (which I missed, sorry!).

sobkulir commented 1 year ago

@timcanham Exactly! It's not a big issue, like the handle is a few bytes. Same for threads, mutex, etc. My main point is that I feel like the API is actively going against the Avoid dynamic allocation after initialization principle.

timcanham commented 1 year ago

Perhaps instead of a template or a bunch of #define macros, we could define it as a typedef that it overridable. Maybe in https://github.com/nasa/fprime/blob/devel/config/FpConfig.h.

sobkulir commented 1 year ago

Haha I was just typing a comment suggesting that -- you mean put it in PlatformTypes.h, right?

timcanham commented 1 year ago

Or that too!

sobkulir commented 1 year ago

Nice! Either would resolve this as far as I can tell. Thanks for the discussion! I don't have the time right now, but in about a month I would be happy to implement this. Do you have a preference over PlatformTypes.h or FpConfig.h? Personally, I feel like PlatformTypes.h fits this use case more, as FpConfig.h is project-specific, while the issue here is platform-specific.

timcanham commented 1 year ago

Good idea! I opened this discussion: https://github.com/nasa/fprime/discussions/2306

timcanham commented 1 year ago

Let's move the trade about where to define it to that discussion.

LeStarch commented 6 months ago

I am going to close this issue. The format used for Os::File and Os::Task seems to work, and as such this is just part of the refactor steps.