Open gerard5 opened 3 years ago
The cmdline is a string before parsing (received directly from plo in the syspage):
"Xdummyfs Ximxrt-multi Xpsh Xspawn_test;1 Xspawn_test;2 Xspawn_test;3 Xspawn_test;4 Xspawn_test;5 "
after parsing, the memory content looks like this:
Xdummyfs\0Ximxrt-multi\0Xpsh\0Xspawn_test\0Xspawn_test\0Xspawn_test\0Xspawn_test\0Xspawn_test\0
but while cmdline is being parsed, each cmdline item is scanned against the prog=syspage->progs list and compared with prog->cmdline. This is a serious problem if cmdline contains multiple programs with the same name, whether with different arguments, the same arguments, or no arguments at all. As in the example in the problem description (see the screenshots there), these five commands with arguments: spawn_test;1, spawn_test;2, spawn_test;3, spawn_test;4, spawn_test;5, lead to 25 processes spawned… what !?
Take a close look at the if-statement block with hal_strcmp() inside the loop:
https://github.com/phoenix-rtos/phoenix-rtos-kernel/blob/54c5d2c61dd3f474aed1dd376967a5590d2de5a1/main.c#L97-L105
What it actually does: for each processed cmdline item, it scans through prog=syspage->progs, and if cmdline+1 matches prog->cmdline, it spawns a process (for now I'm omitting the performance of this scan loop). It gets worse when the same program name appears more than once (which is the topic of this issue): every matching entry is spawned for every item, so N identical entries lead to N^2 spawned processes.
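To see where the 25 comes from, here is a small stand-alone model of the scan. It is only a sketch, not the kernel code: struct prog_model and count_spawns_naive() are hypothetical names, strcmp() stands in for hal_strcmp(), and it assumes that every syspage entry for the repeated program compares equal by name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a syspage program entry; only the name matters here. */
struct prog_model {
	const char *cmdline;
};

/*
 * Model of the scan described above: for every parsed cmdline item, walk the
 * whole program list and "spawn" (here: just count) on every name match.
 * The leading 'X' of each item is skipped, mirroring the cmdline + 1 comparison.
 */
static unsigned int count_spawns_naive(const char *const *items, size_t nitems,
	const struct prog_model *progs, size_t nprogs)
{
	unsigned int spawned = 0;
	size_t i, j;

	for (i = 0; i < nitems; i++) {
		for (j = 0; j < nprogs; j++) {
			if (strcmp(items[i] + 1, progs[j].cmdline) == 0) {
				spawned++; /* the kernel would call proc_syspageSpawn() here */
			}
		}
	}

	return spawned;
}
```

Called with five "Xspawn_test" items and five "spawn_test" entries, the model counts 5 * 5 = 25 spawns, while programs that appear only once are still counted once each.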
As a redesign of the syspage is not the subject of this issue, the temporary solution is to somehow mark the already spawned prog->cmdline program so that it is skipped in the next scan of the syspage->progs loop.
Not judging the solution itself, the below temporary hack seems to work:
u32 skips = 0; /* bit index */
In plo, MAX_PROGRAMS_NB is set to 32, so a 32-bit mask (u32) is OK here. Rootfs-less projects should not go above that limit anyway; if they do, it is a sign that they need a rootfs, with which we agree, I suppose.
for (prog = syspage->progs, i = 0; i < syspage->progssz; i++, prog++) {
	if (!(skips & (1u << i)) && !hal_strcmp(cmdline + 1, prog->cmdline)) {
		skips |= 1u << i;
		argv[0] = prog->cmdline;
		res = proc_syspageSpawn(prog, vm_getSharedMap(prog), prog->cmdline, argv);
		if (res < 0) {
			lib_printf("main: failed to spawn %s (%d)\n", argv[0], res);
		}
		break;
	}
}
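The effect of the mark-and-break approach can be checked outside the kernel with a small model. This is a sketch under assumptions: struct prog_entry, count_spawns_marked() and MODEL_MAX_PROGS are hypothetical names, strcmp() stands in for hal_strcmp(), and the limit of 32 is assumed to mirror MAX_PROGRAMS_NB in plo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_PROGS 32 /* assumption: mirrors MAX_PROGRAMS_NB in plo */

/* Hypothetical stand-in for a syspage program entry; only the name matters here. */
struct prog_entry {
	const char *cmdline;
};

/*
 * Model of the hack: each cmdline item takes the first matching entry that has
 * not been spawned yet, marks it in the bitmask, and stops scanning.
 */
static unsigned int count_spawns_marked(const char *const *items, size_t nitems,
	const struct prog_entry *progs, size_t nprogs)
{
	uint32_t skips = 0; /* bit j set: entry j was already spawned */
	unsigned int spawned = 0;
	size_t i, j;

	/* One bit per entry; shifting a 32-bit value by 32 or more is undefined. */
	assert(nprogs <= MODEL_MAX_PROGS);

	for (i = 0; i < nitems; i++) {
		for (j = 0; j < nprogs; j++) {
			if (!(skips & (UINT32_C(1) << j)) && strcmp(items[i] + 1, progs[j].cmdline) == 0) {
				skips |= UINT32_C(1) << j;
				spawned++; /* the kernel would call proc_syspageSpawn() here */
				break;
			}
		}
	}

	return spawned;
}
```

With five "Xspawn_test" items and five "spawn_test" entries, the model counts five spawns, one per item. The assert makes the 32-entry assumption explicit instead of silently relying on it.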
The result is as expected:
So, if you guys have better solutions, let's discuss them; or, if you agree with the presented approach, let me prepare a PR and commit.
Description
Spawning multiple processes through syspage fails, leading to the number of instances being raised to the power of two (see the attached screenshots). I've checked the syspage content forwarded from plo to the kernel and it looks OK, thus it seems to be an issue with kernel process spawning. I have verified this only on the armv7m7-imxrt106x target, so it may be NOMMU specific and may be observed on armv7m7-imxrt117x too.

This issue may be linked with Jira tasks ([RTOS-1] multiple sysexec usage fails) and [NIL-20]; I've not verified the psh sysexec behavior, but it looks similar.

Simple program which may be used to reproduce:
Aliases are common to all of the cases below, and they are the following:
One instance:
Two instances:
Three instances:
Four instances:
Five instances:
With four or more spawned instances the system stopped responding: neither psh nor the imxrt-multi console input was working. I tried different combinations of maps (including imap=ocram2, dmap=ocram2), but it looked the same every time.