WebAssembly / wasi-threads

tid of the main thread #15

Closed yamt closed 1 year ago

yamt commented 1 year ago

the host allocates the tid in wasi:thread_spawn. because the main thread is not spawned by wasi:thread_spawn, it doesn't have a valid tid.

possible solutions i can think of:

a. reserve a special tid, say 1, for the main thread. make wasi:thread_spawn never allocate that value.
b. introduce a gettid-equivalent host api, which returns the tid of the calling thread.
c. give up the design of host-allocated tids. make libc allocate tids and report them to the host instead.
d. do nothing. accept the fact that the main thread doesn't have a tid.
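As a rough illustration of option (a), a host-side allocator could simply skip the reserved value. This is only a sketch: `MAIN_THREAD_TID`, `TID_MAX`, and `alloc_spawned_tid` are hypothetical names and values, not part of wasi-threads, and the counter is assumed to be protected by whatever lock the host already holds while spawning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; the proposal does not fix them. */
#define MAIN_THREAD_TID 1
#define TID_MAX 0x1fffffff

/* Next candidate TID. A real host would guard this with a lock or atomics
 * and would also track which TIDs are still in use after wraparound. */
static int32_t next_tid = 1;

/* Return a TID for a newly spawned thread, never the reserved one. */
int32_t alloc_spawned_tid(void)
{
    for (;;) {
        int32_t tid = next_tid;
        next_tid = (next_tid == TID_MAX) ? 1 : next_tid + 1;
        if (tid != MAIN_THREAD_TID) {
            assert(tid > 0 && tid <= TID_MAX);
            return tid;
        }
        /* The candidate was the reserved main-thread TID; try the next one. */
    }
}
```

With this shape, libc can treat `MAIN_THREAD_TID` as the main thread's identity without ever asking the host.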

yamt commented 1 year ago

see also https://github.com/WebAssembly/wasi-libc/pull/360

yamt commented 1 year ago

i slightly prefer b. because it's similar to linux.

sbc100 commented 1 year ago

I kind of like (c). There was quite a bit of discussion around this when we were designing thread_spawn, and IIRC I was more in favor of not exposing host TIDs. The only two downsides that I can remember from that discussion were:

  1. It could require the host to maintain a mapping to real TIDs.
  2. If there are two different userspace threading libraries in the same module they might have to coordinate to avoid overlapping userspace TIDs.

I'm unsure whether it would even make sense to mix thread runtimes in a single module. For example, what would it mean if a thread that was not created by pthread_create tried to call pthread_self? I think threading runtimes in the same module would not be able to avoid some level of coordination.

Most discussion on this here: https://github.com/WebAssembly/wasi-libc/pull/325#discussion_r975600581
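To make the two downsides concrete, here is a minimal sketch of the host-side table that option (c) might require. Everything in it is an assumption for illustration: the names `tid_map_register` and `tid_map_lookup`, the fixed `MAX_THREADS` capacity, and the use of a `uint64_t` as the host's own thread identifier. It is not thread-safe as written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64 /* hypothetical capacity */

struct tid_entry {
    bool used;
    int32_t guest_tid; /* TID chosen by libc inside the module */
    uint64_t host_id;  /* whatever the host uses to identify the thread */
};

static struct tid_entry tid_table[MAX_THREADS];

/* Record a guest-chosen TID. Fails if the TID is already registered
 * (two runtimes picked overlapping TIDs) or the table is full. */
bool tid_map_register(int32_t guest_tid, uint64_t host_id)
{
    struct tid_entry *free_slot = NULL;
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (tid_table[i].used) {
            if (tid_table[i].guest_tid == guest_tid)
                return false;
        } else if (free_slot == NULL) {
            free_slot = &tid_table[i];
        }
    }
    if (free_slot == NULL)
        return false;
    free_slot->used = true;
    free_slot->guest_tid = guest_tid;
    free_slot->host_id = host_id;
    return true;
}

/* Translate a guest TID back to the host identifier. */
bool tid_map_lookup(int32_t guest_tid, uint64_t *host_id)
{
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (tid_table[i].used && tid_table[i].guest_tid == guest_tid) {
            *host_id = tid_table[i].host_id;
            return true;
        }
    }
    return false;
}
```

The duplicate check in `tid_map_register` is where uncoordinated threading libraries in one module would collide.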

loganek commented 1 year ago

> I was more in favor of not exposing host TIDs

I think that's not a problem, as hosts can generate thread identifiers independently of the host's TIDs.

I think I'm in favor of option a); as suggested in my comment, we can use 0 as the value for the main thread. Yet another advantage is that we'll have a standard way of identifying the main thread of the program.

sbc100 commented 1 year ago

Do we want to expose these TIDs directly to userspace (e.g. in the return value of wasi-libc's gettid())? On Unix, tid 1 has special meaning as the primordial process, so we might want to avoid returning 0 or 1 from gettid(), but maybe that's just a hypothetical issue.

yamt commented 1 year ago

i don't think it makes much sense to expose gettid or the TID concept itself to user apps.

my main concern is wasi-libc internal uses of TIDs.

sbc100 commented 1 year ago

In that case I think it's reasonable to reserve 0 for the main thread. It seems unlikely any host OS would ever use that as a TID.

yamt commented 1 year ago

in https://github.com/WebAssembly/wasi-libc/pull/360 i used a different value (0x3fffffff) because 0 doesn't work with the current wasi-libc usage, as commented in the patch.

i agree that, if we reserve a value, it solves the immediate problem. it's option (a).
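One plausible way a TID of 0 can break libc-internal users is sketched below. It assumes a lock word in which 0 means "unlocked" and any other value is the owner's TID, as is common in futex-style mutexes; the patch's own comment is not reproduced here, and `owner_lock`, `owner_trylock`, and `owner_holds` are hypothetical names, not wasi-libc APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Reserved main-thread TID, the value mentioned for wasi-libc PR 360. */
#define MAIN_THREAD_TID 0x3fffffff

/* Hypothetical lock whose word stores the owner's TID; 0 means unlocked. */
typedef struct {
    _Atomic int32_t owner;
} owner_lock;

/* Try to take the lock by swapping 0 for our TID. */
bool owner_trylock(owner_lock *l, int32_t self_tid)
{
    int32_t expected = 0;
    return atomic_compare_exchange_strong(&l->owner, &expected, self_tid);
}

/* Check whether the calling thread is the recorded owner. */
bool owner_holds(owner_lock *l, int32_t self_tid)
{
    return atomic_load(&l->owner) == self_tid;
}
```

If the main thread's TID were 0, its successful `owner_trylock` would leave the word at 0, so the lock would still look free to every other thread. A nonzero reserved value avoids that ambiguity.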

yamt commented 1 year ago

i submitted https://github.com/WebAssembly/wasi-threads/pull/16. it's along the lines of (a), but it doesn't rule out the other options.

abrown commented 1 year ago

My opinion: I think @sbc100 documents well here the previous reasons we went with option a. I still lean towards that option, though I'm not opposed to re-discussing b and c as well. Since I approved #16 and it continues the status quo of option a, should we close this issue?

yamt commented 1 year ago

i don't know what "approved but not merged yet" status means in this repo. if we have decided to reserve tids, i have no problem with closing this issue.

loganek commented 1 year ago

The PR is already merged, so I'm closing the issue, but feel free to re-open it if needed.