zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.6k stars 6.49k forks source link

undefined reference to _open when using newlib with Zephyr3.3.99 and above #66132

Open pnchatterji opened 10 months ago

pnchatterji commented 10 months ago

Hello,

I am building an application based on a Nordic platform (nrf52840) using NRF Connect SDK (NCS) which encapsulates Zephyr. This application works fine with NCS 2.3.0 (Zephyr 3.2.99), but the build fails with NCS 2.4.2 and above (Zephyr 3.3.99 and above). I use the following config settings: CONFIG_NEWLIB_LIBC=y CONFIG_FILE_SYSTEM=y CONFIG_FILE_SYSTEM_LITTLEFS=y CONFIG_POSIX_API=y CONFIG_NEWLIB_LIBC=y CONFIG_NEWLIB_LIBC_NANO=n

I use fopen() in my code for opening files (it is a customer requirement to use the POSIX API for file handling). I get the following build error:

D:/ncs2/toolchains/c57af46cb7/opt/zephyr-sdk/arm-zephyr-eabi/arm-zephyr-eabi/lib/thumb/v7e-m/nofp\libc.a(lib_a-openr.o): in function _open_r': openr.c:(.text._open_r+0x14): undefined reference to_open'

Note that if I set
CONFIG_NEWLIB_LIBC=n CONFIG_NEWLIB_LIBC_NANO=n i.e. if I use the default Pico clib, it works fine. I repeat that this is an issue only if I use Zephyr 3.3.99 and higher. It works fine with Zephyr 3.2.99 and lower. Clearly the POSIX support for Newlibc is broken .

NOTE: I have posted this issue on the Nordic support forum, and Nordic engineers have confirmed this issue, but told me they can't fix it. They recommended that I post this on the Zephyr support forum. Here is the link: https://devzone.nordicsemi.com/f/nordic-q-a/105594/undefined-reference-to-_open-when-using-libc-with-ncs-2-4-2-and-above

To reproduce the bug: clone the Hello world sample.

Modify main.c like so:

include

int main(void) { printf("Hello World!\n"); FILE *f = fopen("posix.txt","a"); return 0; }

modify prj.conf like so:

CONFIG_NEWLIB_LIBC=y CONFIG_NEWLIB_LIBC_NANO=n CONFIG_FILE_SYSTEM=y CONFIG_FILE_SYSTEM_LITTLEFS=y CONFIG_FLASH=y CONFIG_FLASH_MAP=y CONFIG_FLASH_PAGE_LAYOUT=y CONFIG_POSIX_API=y Create a build configuration for the board nrf52840dk_nrf52840 You should get the above error.

Toolchain Used: I use the default toolchain installed by NCS for their respective versions.

Appreciate if this bug could be fixed at an early date. Thanks regards PNC

github-actions[bot] commented 10 months ago

Hi @pnchatterji! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

henrikbrixandersen commented 10 months ago

Thank you for reporting this. However - unless you are able to reproduce this issue with upstream Zephyr main - please report issues with the nRF Connect SDK (NCS) on the Nordic Semiconductor DevZone.

pnchatterji commented 10 months ago

Hi Henrik, I have confirmed it with the latest dev version of NCS, NCS 2.5.99-dev 1, which corresponds to Zephyr 3.4.99 . I am not familiar with using Zephyr outside of the NCS infrastructure. As stated previously, I HAD posted this to Nordic DevZone, but they directed me to you. Looks like we users are caught in a ping-pong game between Zephyr and Nordic!

pnchatterji commented 10 months ago

Hi Henrik,

I posted your response back on Nordic DevZone, and they have confirmed that the problem persists even with the upstream Zephyr head. You can look at their full detailed response, with build log etc, on this link: https://devzone.nordicsemi.com/f/nordic-q-a/105594/undefined-reference-to-_open-when-using-libc-with-ncs-2-4-2-and-above/458944

If you wish me to copy their response rather then send a link, let me know. thanks regards PNC

morinv-actif commented 9 months ago

Priority should be set to high and critical, as the entire Nordic customer base is suffering. The project delivery has been put on hold because of this, which requires us to collaborate with the Nordic team to resolve the issue. And they seem to redirect here.

We need this issue to be resolved as soon as possible. It would be great if you could have a chat with the team responsible for fixing it and let them know that it is their responsibility to take care of it. We don't want to keep going back and forth with Nordic and Zephyr about this.

Kindly inform us once the issue is fixed.

carlescufi commented 9 months ago

@pnchatterji @morinv-actif I had a look into this, and found out that reverting 3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 fixes this issue.

@cfriedt could you please check out this branch: https://github.com/carlescufi/zephyr/tree/repro-66132 and then build:

west build -b qemu_x86 samples/hello_world

This will fail with undefined reference to '_open'. If you then revert 3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 it will build fine.

cc @keith-packard since what seems to be the offending commit also includes a reference to picolibc.

cfriedt commented 9 months ago

I am building an application based on a Nordic platform (nrf52840) using NRF Connect SDK (NCS) which encapsulates

@pnchatterji - I'm afraid I cannot realistically support the nrf sdk in my spare time, unfortunately, but I will make some notes.

I use fopen() in my code for opening files (it is a customer requirement to use the POSIX API for file handling). I get the following build error:

I would refer you to the POSIX conformance and Suprofiling Option Groups (under POSIX_DEVICE_IO) pages.

https://docs.zephyrproject.org/latest/services/portability/posix/aep/index.html https://docs.zephyrproject.org/latest/services/portability/posix/option_groups/index.html

fopen() is has actually never been supported. It is clearly documented as unsupported. Since that is one of the features of the POSIX_DEVICE_IO option group, we also cannot say that we support that option group yet either (although I would really love to!).

I do appreciate that many Nordic customers (and many others) would like to use standard POSIX APIs in Zephyr and have things Just Work.

The work that needs to happen for that is (mostly) outlined in https://github.com/zephyrproject-rtos/zephyr/issues/51211 .

In particular, we very much require an extensible native Zephyr file-descriptor I/O library, as integer file descriptors are commonly used for everything from shared memory, to filesystems, to sockets. The open() and fopen() POSIX calls should be wrappers that eventually lead to a common zvfs_open() call. IIRC, the reason that _open() is referenced is because we are missing an implementation of fopen() in POSIX.

cc @jgl-meta @jukkar @rlubos

BTW, starting on this would be the most awesome gift of all this holiday season 😉

D:/ncs2/toolchains/c57af46cb7/opt/zephyr-sdk/arm-zephyr-eabi/arm-zephyr-eabi/lib/thumb/v7e-m/nofp\libc.a(lib_a-openr.o): in function _open_r': openr.c:(.text._open_r+0x14): undefined reference to_open'

Can you please confirm whether CONFIG_POSIX_FS=y in your configuration as well, just out of curiousity?

Note that if I set CONFIG_NEWLIB_LIBC=n CONFIG_NEWLIB_LIBC_NANO=n i.e. if I use the default Pico clib, it works fine. I repeat that this is an issue only if I use Zephyr 3.3.99 and higher. It works fine with Zephyr 3.2.99 and lower. Clearly the POSIX support for Newlibc is broken .

POSIX support for Newlib is, in fact, not broken, and I would refer you to the rather extensive set of tests here.

tests/posix/fs/testcase.yaml

You are just attempting to use features from a sub-profiling option group that are not (yet) fully supported.

I would be happy to accept a change that modifies this as long tests/posix and tests/net (others?) are run on all supported architectures. First I would suggest trying CONFIG_POSIX_FS=y.

So, in short, @carlescufi - PRs are welcome. But what would be even more welcome are the structural changes and POSIX features required to support Nordic customers.

I would also advise Nordic users to not attempt to use POSIX features that are not yet supported in Zephyr, however tempting that might be.

Help from Nordic on desired POSIX features would be greatly appreciated!

Feel free to add (or implement) features in the LTSv3 Roadmap https://github.com/zephyrproject-rtos/zephyr/issues/51211.

cfriedt commented 9 months ago

It seems my comment is duplicated on mobile for some reason. My apologies for spamming.

keith-packard commented 9 months ago

Hrm. newlib doesn't use the POSIX open/close/read/write functions, instead it uses the newlib-specific _open/_close/_read/_write functions which offer similar semantics with different names. I'm unsure why https://github.com/zephyrproject-rtos/zephyr/commit/3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 removed the definition of the newlib functions that are required for newlib's stdio to work? I'm not sure this is strictly a POSIX conformance issue; I think it's an issue of providing the API that newlib uses to implement the C stdio api on top of Zephyr file APIs.

cfriedt commented 9 months ago

I'm unsure why 3aff1ff removed the definition of the newlib functions

It did not remove the definition of newlib functions, just ensured that they were not duplicated (which would be the case if either CONFIG_NEWLIB_LIBC=y or CONFIG_PICOLIBC=y. Apparently something else relied on those symbols being present at the time.

keith-packard commented 9 months ago

It did not remove the definition of newlib functions, just ensured that they were not duplicated (which would be the case if either CONFIG_NEWLIB_LIBC=y or CONFIG_PICOLIBC=y. Apparently something else relied on those symbols being present at the time.

Weird. Sounds like something changed then. I can't find another definition of _open in the tree except when CONFIG_POSIX_API is not set. (in lib/libc/newlib/libc-hooks.c).

carlescufi commented 9 months ago

@cfriedt thank you for all the feedback. You asked:

Can you please confirm whether CONFIG_POSIX_FS=y in your configuration as well, just out of curiousity?

I can confirm that enabling this does not fix the issue or help in any way, I have it set in my branch and, based on other conversations, I know that @pnchatterji has tried it too.

@pnchatterji said:

I use fopen() in my code for opening files

and then @cfriedt said:

fopen() is has actually never been supported.

There's something that doesn't match here, so let me ask:

carlescufi commented 9 months ago

It did not remove the definition of newlib functions, just ensured that they were not duplicated (which would be the case if either CONFIG_NEWLIB_LIBC=y or CONFIG_PICOLIBC=y. Apparently something else relied on those symbols being present at the time.

Weird. Sounds like something changed then. I can't find another definition of _open in the tree except when CONFIG_POSIX_API is not set. (in lib/libc/newlib/libc-hooks.c).

Me neither, I was wondering the exact same. @cfriedt are you sure that https://github.com/zephyrproject-rtos/zephyr/commit/3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 did the right thing?

cfriedt commented 9 months ago

@cfriedt thank you for all the feedback. You asked:

Can you please confirm whether CONFIG_POSIX_FS=y in your configuration as well, just out of curiousity?

I can confirm that enabling this does not fix the issue or help in any way, I have it set in my branch and, based on other conversations, I know that @pnchatterji has tried it too.

I wasn't implying that it should fix anything. Was just curious.

@pnchatterji said:

I use fopen() in my code for opening files

and then @cfriedt said:

fopen() is has actually never been supported.

There's something that doesn't match here, so let me ask:

  • @pnchatterji are you sure fopen() works for you on older Zephyr versions?
  • @cfriedt are you sure fopen() has never worked?

I have no idea if fopen() has or has not worked out of happenstance at any time.

I just know that it (and most other FILE * based APIs) have never been supported in Zephyr. The table that I referenced was carried over from the original POSIX maintainer, @pfalcon .

cfriedt commented 9 months ago

It did not remove the definition of newlib functions, just ensured that they were not duplicated (which would be the case if either CONFIG_NEWLIB_LIBC=y or CONFIG_PICOLIBC=y. Apparently something else relied on those symbols being present at the time.

Weird. Sounds like something changed then. I can't find another definition of _open in the tree except when CONFIG_POSIX_API is not set. (in lib/libc/newlib/libc-hooks.c).

If there is one thing that is constant, it's change ;-)

fdtable was not part of posix at the time. If it is categorized as part of posix, that is somewhat recent.

Note also, config posix api is just an option that enables individual posix options. It was possible (still is?) to use many posix features (FS, etc) individually without config posix api.

Me neither, I was wondering the exact same. @cfriedt are you sure that https://github.com/zephyrproject-rtos/zephyr/commit/3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 did the right thing?

It was necessary at the time. Yes.

keith-packard commented 9 months ago

fdtable was not part of posix at the time. If it is categorized as part of posix, that is somewhat recent.

fdtable isn't, but all of its POSIX functions are. And, that only includes read, _read, write, _write, close, _close, lseek, _lseek, ioctl and fcntl.

Given that the fdtable stuff defines underscore aliases for some of its functions, it seems odd to me that lib/posix/fs.c doesn't define _open?

pnchatterji commented 9 months ago

@carlescufi yes, I confirm that fopen() works with Zephyr 3.2.99 and older for NewLib, and STILL works on newer versions with Picolibc. The issue is only relevant for Zephyr 3.3.99 and newer, and ONLY for Newlib. The issue is not just a matter of preference. It is also about portability.

cfriedt commented 9 months ago

@carlescufi yes, I confirm that fopen() works with Zephyr 3.2.99 and older for NewLib, and STILL works on newer versions with Picolibc. The issue is only relevant for Zephyr 3.3.99 and newer, and ONLY for Newlib. The issue is not just a matter of preference. It is also about portability.

@pnchatterji - feel free to make a PR.

krish2718 commented 5 months ago

@cfriedt I am also seeing the issue now (after an upmerge in NCS) openr.c:(.text._open_r+0x14): undefined reference to_open'` , and reverting https://github.com/nrfconnect/sdk-zephyr/commit/3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 solves the issue for me.

From a quick look I guess the wrappers are conditional, so, might not always be included?

cfriedt commented 5 months ago

@cfriedt I am also seeing the issue now (after an upmerge in NCS) openr.c:(.text._open_r+0x14): undefined reference to_open'` , and reverting https://github.com/nrfconnect/sdk-zephyr/commit/3aff1ff0c2ff0a0b2a3aab0ae4c547ac206200e9 solves the issue for me.

From a quick look I guess the wrappers are conditional, so, might not always be included?

@krish2718 - where is openr.c? I cannot see it in the zephyr project repo

Zephyr's fs subsystem is very much in need of an overhaul. It hasn't really been touched in some time.

If reverting that commit in Zephyr works without causing any CI failures, please feel free to make a PR.

krish2718 commented 5 months ago

where is openr.c? I cannot see it in the zephyr project repo

It's part of the libc-newlib, it expects _open to be defined, but without the alias only open is defined.

If reverting that commit in Zephyr works without causing any CI failures, please feel free to make a PR.

Yes, it does work, removing all calls to open/fopen also works, let me raise a PR once I can repro it on tests/posix/fs /portability.posix.fs.tls.newlib, as the tests are passing.

cfriedt commented 3 months ago

@krish2718 - this should be fixed by #73047. Can you please confirm?