FreeRTOS / FreeRTOS-Kernel

FreeRTOS kernel files only, submoduled into https://github.com/FreeRTOS/FreeRTOS and various other repos.
https://www.FreeRTOS.org
MIT License
2.62k stars 1.09k forks source link

Improve POSIX port functionality #914

Closed cmorganBE closed 7 months ago

cmorganBE commented 9 months ago

Posix port improvements.

I'm interested in feedback on these changes. You can see the test applications here, https://github.com/cmorganBE/freertos_posix

Description

The posix port had a handful of issues that were making it difficult to use, including:

These commits build upon the posix port to resolve these issues.

Test Steps

Checklist:

Related Issue

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

cmorganBE commented 9 months ago

@Skptak can you re-initiate the workflows? I was waiting for them to complete and just realized they don't run without approval....

Skptak commented 9 months ago

Hey @cmorganBE, thanks for moving that idle hook call to work the way you have it set up now! And thanks for responding to the comments in your PR 😄

There are a few FreeRTOS Style Guide guide differences in your PR than what we use. For example the static pthread_t timer_tick_thread; would be called something closer to static Thread_t pxTimerTask. Then these files should be run using uncrustify version .69.

Where I was wondering if you'd be fine with me adding a few commits to your PR to address these problems? If you want to do it yourself that's fine as well!

I appreciate the effort you've put into this PR so far, and thanks again for your contribution!

cmorganBE commented 9 months ago

Hi @Skptak . It's no trouble. We've got a ton of code that would be helped by having this working better and its been interesting learning how the port works and more about freertos internals.

If you want to add commits and feel they align with the spirit of the PR feel free to do so.

On the timer tick thread, this isn't a FreeRTOS managed thread per-se, that's why I didn't use Thread_t there, there isn't much use for most of the stuff in that struct. Renaming it to pxTimerTask might make sense but again, it's not a freeRTOS task at the moment. I think it wouldn't make sense to make it a FreeRTOS task either, in that case the task would end up being suspended and I think the signals that serve as interrupts in this approach wouldn't be generated correctly.

Hey @cmorganBE, thanks for moving that idle hook call to work the way you have it set up now! And thanks for responding to the comments in your PR 😄

There are a few FreeRTOS Style Guide guide differences in your PR than what we use. For example the static pthread_t timer_tick_thread; would be called something closer to static Thread_t pxTimerTask. Then these files should be run using uncrustify version .69.

Where I was wondering if you'd be fine with me adding a few commits to your PR to address these problems? If you want to do it yourself that's fine as well!

I appreciate the effort you've put into this PR so far, and thanks again for your contribution!

chinglee-iot commented 8 months ago

@cmorganBE I am also running the test you shared in this repository https://github.com/cmorganBE/freertos_posix. It probably take some time for me to review some of the changes. Once I finish, I will update in this PR again.

cmorganBE commented 8 months ago

@chinglee-iot I've updated to fix the smp build and fixed a formatting so the build error and formatting error should be resolved.

The posix test repo has also been updated to point to my fork with the latest changes here.

I'm not sure why the cmock tests aren't passing and could use some pointers there because afaict nothing changed in the core should result in any new failures.

I'm having trouble figuring out how to run the tests here but having trouble figuring out how to do so locally, pointers on why they are failing and how to run them would be welcome.

cmorganBE commented 8 months ago

@chinglee-iot applied the list changes to drop the requirement to enable tracing, thank you for making the changes in your PR. Also dropped off the clean restart changes so this is just the posix related ones for now. Should be ready for review.

codecov[bot] commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (529de56) 93.42% compared to head (7a4785d) 93.42%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #914 +/- ## ======================================= Coverage 93.42% 93.42% ======================================= Files 6 6 Lines 3195 3195 Branches 887 887 ======================================= Hits 2985 2985 Misses 103 103 Partials 107 107 ``` | [Flag](https://app.codecov.io/gh/FreeRTOS/FreeRTOS-Kernel/pull/914/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FreeRTOS) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/FreeRTOS/FreeRTOS-Kernel/pull/914/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FreeRTOS) | `93.42% <ø> (ø)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FreeRTOS#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

cmorganBE commented 7 months ago

@chinglee-iot any updates on this PR or actions you need me to take? We are starting a project and want to make use of the posix port for testing and I'd prefer to use mainline freertos vs. a fork for this and the reinitialize changes.

cmorganBE commented 7 months ago

Ping @Skptak @chinglee-iot re-review on this PR? Looks like all tests are passing now with the latest merge of main.

chinglee-iot commented 7 months ago

@cmorganBE ( Edited for incorrect information ) I am creating a PR to propose re-initialise internal variables in this PR #944 as well as some of your fix in this PR and is still under review. The discussion may still take some time.

I didn't merge the following fixes in this PR to PR #944:

With timer tick thread:

~/pr_review/dev/freertos_posix/build$ make test
Running tests...
Test project /home/ubuntu/pr_review/dev/freertos_posix/build
      Start  1: simple_task_end
 1/10 Test  #1: simple_task_end .....................   Passed    1.30 sec
      Start  2: simple_task_end_valgrind
( blocking here for more than 5 minutes )

With interval timer:

Running tests...
Test project /home/ubuntu/pr_review/dev/freertos_posix/build
      Start  1: simple_task_end
 1/10 Test  #1: simple_task_end .....................   Passed    1.20 sec
      Start  2: simple_task_end_valgrind
 2/10 Test  #2: simple_task_end_valgrind ............   Passed    1.89 sec
      Start  3: simple_task_end_2
 3/10 Test  #3: simple_task_end_2 ...................   Passed    1.00 sec
      Start  4: simple_task_end_2_valgrind
 4/10 Test  #4: simple_task_end_2_valgrind ..........   Passed    1.73 sec
      Start  5: simple_static_task_end
 5/10 Test  #5: simple_static_task_end ..............   Passed    1.20 sec
      Start  6: simple_static_task_end_valgrind
 6/10 Test  #6: simple_static_task_end_valgrind .....   Passed    1.88 sec
      Start  7: simple_static_task_end_2
 7/10 Test  #7: simple_static_task_end_2 ............   Passed    1.00 sec
      Start  8: simple_static_task_end_2_valgrind
 8/10 Test  #8: simple_static_task_end_2_valgrind ...   Passed    1.73 sec
      Start  9: restart_scheduler
 9/10 Test  #9: restart_scheduler ...................   Passed    5.01 sec
      Start 10: restart_scheduler_valgrind
10/10 Test #10: restart_scheduler_valgrind ..........   Passed    5.74 sec

100% tests passed, 0 tests failed out of 10

Total Test time (real) =  22.39 sec

My environment:

ubuntu@ip-:~/pr_review/dev/freertos_posix/build$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ubuntu@ip-:~/pr_review/dev/freertos_posix/build$ valgrind --version
valgrind-3.15.0

We would like to clarify the requirement and alternative approaches to the questions above. If it is clear to us that these changes are required and no alternative approach can meet the requirement, we would like to merge the fixes in this PR. Thank you for your time to discuss with us.

cmorganBE commented 7 months ago

@cmorganBE ( Edited for incorrect information ) I am creating a PR to propose re-initialise internal variables in this PR #944 as well as some of your fix in this PR and is still under review. The discussion may still take some time.

I didn't merge the following fixes in this PR to PR #944:

The first work around is flawed as you can get a signal before the new thread can block them. The second work around is imo a silly additional effort to ask of the end user. I'd have to modify all threads created, and what if some of them were created by 3rd party libraries of which code I had no access? I'd be stuck with those threads able to deadlock...

If we use directed signals we avoid affecting ANY outside threads. Users of the posix native port don't have to know about its usage of signals, or how to block them etc, it just works. Imo this is the value of the change, it causes the posix port to behave in a more isolated manner.

If we can make the native posix port not require the user to jump through extra hoops we should do so.

Another problem is that timer tick thread blocks for a long time in my environment when running with valgrind. It can be a problem in my environment. Can you share your environment setup?

I see the same thing here. I dug in a bit and this is a byproduct of valgrind's signal overhead. For valgrind builds I've been reducing the system timer tick rate from 1000hz to 10hz. This should be the same issue seen with the prior implementation as its related to valgrind signal overhead and not the mechanism to dispatch them.

With timer tick thread:

~/pr_review/dev/freertos_posix/build$ make test
Running tests...
Test project /home/ubuntu/pr_review/dev/freertos_posix/build
      Start  1: simple_task_end
 1/10 Test  #1: simple_task_end .....................   Passed    1.30 sec
      Start  2: simple_task_end_valgrind
( blocking here for more than 5 minutes )

With interval timer:

Running tests...
Test project /home/ubuntu/pr_review/dev/freertos_posix/build
      Start  1: simple_task_end
 1/10 Test  #1: simple_task_end .....................   Passed    1.20 sec
      Start  2: simple_task_end_valgrind
 2/10 Test  #2: simple_task_end_valgrind ............   Passed    1.89 sec
      Start  3: simple_task_end_2
 3/10 Test  #3: simple_task_end_2 ...................   Passed    1.00 sec
      Start  4: simple_task_end_2_valgrind
 4/10 Test  #4: simple_task_end_2_valgrind ..........   Passed    1.73 sec
      Start  5: simple_static_task_end
 5/10 Test  #5: simple_static_task_end ..............   Passed    1.20 sec
      Start  6: simple_static_task_end_valgrind
 6/10 Test  #6: simple_static_task_end_valgrind .....   Passed    1.88 sec
      Start  7: simple_static_task_end_2
 7/10 Test  #7: simple_static_task_end_2 ............   Passed    1.00 sec
      Start  8: simple_static_task_end_2_valgrind
 8/10 Test  #8: simple_static_task_end_2_valgrind ...   Passed    1.73 sec
      Start  9: restart_scheduler
 9/10 Test  #9: restart_scheduler ...................   Passed    5.01 sec
      Start 10: restart_scheduler_valgrind
10/10 Test #10: restart_scheduler_valgrind ..........   Passed    5.74 sec

100% tests passed, 0 tests failed out of 10

Total Test time (real) =  22.39 sec

My environment:

ubuntu@ip-:~/pr_review/dev/freertos_posix/build$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ubuntu@ip-:~/pr_review/dev/freertos_posix/build$ valgrind --version
valgrind-3.15.0
  • portHAS_IDLE_HOOK : As I replied in this link. This hook is probably useful when vTaskEndScheduler is called in non-FreeRTOS task. Is it a requirement to your application to call vTaskEndScheduler in a non-FreeRTOS task thread? If vTaskEndScheduler is called in a FreeRTOS task or SIGALARM is handled by current running FreeRTOS task, idle task can be cancelled by the port. If not, can we make use of vApplicationIdleHook for the same purpose?

At the moment we are only calling vTaskEndScheduler from a freertos thread. I this it is a reasonable requirement to say that "vTaskEndScheduler must be called from a freertos thread".

I'm not following the vApplicationIdleHook. Are you saying that the application idle hook could be used to call vTaskEndScheduler to aid in not requiring a separate task from which to call it? I can see that as working and that makes sense.

We would like to clarify the requirement and alternative approaches to the questions above. If it is clear to us that these changes are required and no alternative approach can meet the requirement, we would like to merge the fixes in this PR. Thank you for your time to discuss with us.

cmorganBE commented 7 months ago

@chinglee-iot I saw some mention of interrupts being disabled and the timer tick causing issues but maybe that went away during an edit? If that is an issue we could have the port use an atomic variable to determine if a tick should be dispatched, and disabling interrupts call could flip that atomic, timer thread checks that and won't dispatch it if it isn't set to true.

chinglee-iot commented 7 months ago

@chinglee-iot I saw some mention of interrupts being disabled and the timer tick causing issues but maybe that went away during an edit? If that is an issue we could have the port use an atomic variable to determine if a tick should be dispatched, and disabling interrupts call could flip that atomic, timer thread checks that and won't dispatch it if it isn't set to true.

Yes. I edited the reply cause this information is wrong and no need to be discussed. In this PR, SIGALARM will only be sent to the thread which is running the FreeRTOS current task.

chinglee-iot commented 7 months ago

@cmorganBE Thank you for your reply. Timer tick thread makes sense to me. Just valgrind signal handling doesn't work with it very well. I intend to approve timer tick thread as it has more benefit. Some format NIT suggestion:

At the moment we are only calling vTaskEndScheduler from a freertos thread. I this it is a reasonable requirement to say that "vTaskEndScheduler must be called from a freertos thread".

I'm not following the vApplicationIdleHook. Are you saying that the application idle hook could be used to call vTaskEndScheduler to aid in not requiring a separate task from which to call it? I can see that as working and that makes sense.

Yes, this is one of the way how you can use vApplicationIdleHook.

From your previous description. The idle hook introduced in this PR tries to create a cancellation point for the idle task. Otherwise, the idle task will be running in a loop and can't be cancelled by main thread.

I would like to discuss the following scenarios where the vTaskEndScheduler is called:

Since vTaskEndScheduler is only called from a FreeRTOS task, I don't think portHAS_IDLE_HOOK is necessary to cancel the idle task.

cmorganBE commented 7 months ago

@chinglee-iot I also don't see any test failures if I disable the cancellation point. I'll drop the port idle change off in the next push of this branch. I can't help but think that if the idle thread isn't waiting on an event that we'd have an issue where it never hit a cancellation point. Perhaps that behavior was changed by the directed signals. In any case I'll drop it and keep a copy of it here locally just in case.

I'll also update the comments to use c style comments, and adjust various names as you've mentioned and across the changes made.

cmorganBE commented 7 months ago

@chinglee-iot addressed naming, the port idle change, comments, and a few other minor formatting things with this latest push.

chinglee-iot commented 7 months ago

@cmorganBE I created a PR to your branch to fix the following:

Please help to take a look. I will also revert the posix change in #944.

Some update about re-initialize PR. We intend to public the re-initialization ( will be rename to reset state ) API in this PR. User will need to call these APIs before next time restart scheduler.

Pseudo example code :

/* First time start scheduler. */
vTaskStartScheduler();

/* End scheduler in a FreeRTOS task. */
vTaskEndScheduler();

/* Calling FreeRTOS reset state API before next time restart scheduler. */
#if ( configUSE_CO_ROUTINES == 1 )
{
    vCoRoutineResetState();
}
#endif
#if ( configUSE_TIMERS == 1 )
{
    vTimerResetState();
}
#endif /* #if ( configUSE_TIMERS == 1 ) */
#if ( configSUPPORT_DYNAMIC_ALLOCATION == 1 )
    vPortHeapResetState();
#endif
vTaskResetState();

/* Restart scheduler. */
vTaskStartScheduler();

/* End scheduler in a FreeRTOS task. */
vTaskEndScheduler();
cmorganBE commented 7 months ago

@chinglee-iot only had one comment on the pthread stack stuff on the other PR.

cmorganBE commented 7 months ago

@chinglee-iot merged and re-added the stack size commit, let me know if it looks ok. Rewrote the commit log there to improve the arguments for the change.

cmorganBE commented 7 months ago

@chinglee-iot all tests except the restart ones pass from git@github.com:cmorganBE/freertos_posix.git

sonarcloud[bot] commented 7 months ago

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud