spdk / spdk

Storage Performance Development Kit
https://spdk.io/
Other
3.09k stars 1.2k forks source link

Voluntarily spdk_thread_exit() and spdk_thread_destroy() calls failed application shutdown #1161

Closed shuhei-matsumoto closed 4 years ago

shuhei-matsumoto commented 4 years ago

Please use the issue tracker only for reporting suspected issues.

See The SPDK Community Page for other SPDK communications channels.

spdk_thread_exit() set thread->exit to true and spdk_thread_destroy() frees thread resource. However they don't notify them to reactor. Hence we could not make any thread call spdk_thread_exit() and spdk_thread_destroy() voluntarily.

Expected Behavior

spdk_thread_exit() removes the passed thread from reactor, i.e., deschedule the passed thread.

Current Behavior

Reactor shutdown detects error when any thread calls spdk_thread_exit() and spdk_thread_destroy() voluntarily during run time.

Possible Solution

Patches will come soon.

Steps to Reproduce

During runtime, SPDK application creates SPDK thread and then exit and destroy it. Then shutdown SPDK application.

Context (Environment including OS version, SPDK version, etc.)

Latest SPDK master

shuhei-matsumoto commented 4 years ago

Please give me a few days. I hope patches will be ready soon.

shuhei-matsumoto commented 4 years ago

Patches are ready to review now. The main patch is https://review.gerrithub.io/c/spdk/spdk/+/482469/1 Thanks in advance.

shuhei-matsumoto commented 4 years ago

Sorry, the patch is not ready to review yet. I found a solution and I'm developing it. Please stay tune for a while.

ak-gh commented 4 years ago

I noticed this a while back, and reported on the mailing list. You might want to consult the respective discussion.

Regards, Andrey

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

Thank you, I'm glad you noticed this issue :)

Yes, I already read your bdevperf patch in detail and got good insight. Definitely I will take a look at the discussion again, and will ask your help.

Please stay tune shortly.

-Shuhei

shuhei-matsumoto commented 4 years ago

Hi All,

We are still busy for the next release but the patch series are ready to review starting from https://review.gerrithub.io/c/spdk/spdk/+/479390/11

I tried to incorporate all discussion by Ben and Andrey on mailing list and Vitaliy’s patch.

I completed iSCSI first.

I’ll move to bdevperf and NVMe-oF target in the mean time

Thanks Shuhei

ak-gh commented 4 years ago

On Wed, Jan 29, 2020, 14:33 shuhei-matsumoto notifications@github.com wrote:

Hi All,

We are still busy for the next release but the patch series are ready to review starting from https://review.gerrithub.io/c/spdk/spdk/+/479390/11

I tried to incorporate all discussion by Ben and Andrey on mailing list and Vitaliy’s patch.

Hi Shuhei,

nice to see this moving forward. Could you please point me to the specific patch re spdk_thread?

Thanks, Andrey

I completed iSCSI first.

I’ll move to bdevperf and NVMe-oF target in the mean time

Thanks Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB56I6L27X7U2UP57ZTRAFSQLA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKG37SI#issuecomment-579715017, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB7ACMZQ7ENZ3RAJXA3RAFSQLANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

The main is https://review.gerrithub.io/c/spdk/spdk/+/482469/16

https://review.gerrithub.io/c/spdk/spdk/+/483159/2 https://review.gerrithub.io/c/spdk/spdk/+/483160/2 https://review.gerrithub.io/c/spdk/spdk/+/483162/2 https://review.gerrithub.io/c/spdk/spdk/+/483163/3 https://review.gerrithub.io/c/spdk/spdk/+/483164/3 https://review.gerrithub.io/c/spdk/spdk/+/483165/3 https://review.gerrithub.io/c/spdk/spdk/+/483166/4 Are also important.

Most are to handle io channel, io device, poller, and message correctly.

The first use case is iscsi target. https://review.gerrithub.io/c/spdk/spdk/+/481800/21

I’m working on bdevperf and NVMe of target next

I’ll add more documentation tomorrow

Thanks Shuhei

ak-gh commented 4 years ago

On Wed, Jan 29, 2020, 15:59 shuhei-matsumoto notifications@github.com wrote:

Hi @ak-gh https://github.com/ak-gh

The main is https://review.gerrithub.io/c/spdk/spdk/+/482469/16

Thanks for the pointers, I'll take a look at the series.

Regards, Andrey

https://review.gerrithub.io/c/spdk/spdk/+/483159/2 https://review.gerrithub.io/c/spdk/spdk/+/483160/2 https://review.gerrithub.io/c/spdk/spdk/+/483162/2 https://review.gerrithub.io/c/spdk/spdk/+/483163/3 https://review.gerrithub.io/c/spdk/spdk/+/483164/3 https://review.gerrithub.io/c/spdk/spdk/+/483165/3 https://review.gerrithub.io/c/spdk/spdk/+/483166/4 Are also important.

Most are to handle io channel, io device, poller, and message correctly.

The first use case is iscsi target. https://review.gerrithub.io/c/spdk/spdk/+/481800/21

I’m working on bdevperf and NVMe of target next

I’ll add more documentation tomorrow

Thanks Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB35OW2UHUTVXHQNHGTRAF4UBA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKHDF2A#issuecomment-579744488, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB23CY4WKWVS5CQLLS3RAF4UBANCNFSM4KKPF3VQ .

ak-gh commented 4 years ago

I went through the spdk_thread_exit related patches, and would like to share my impressions.

I believe the series represents a massive change, motivated and caused primarily by the decision to mark thread exited inline under spdk_thread_exit. This causes a lot of issues addressed by other patches in the series.

I believe the key inline exit decision may be avoided, resulting in a more efficient and much more concise solution as outlined in the pseudo-code below.

  1. Drop spdk_thread_destroy from the public API. This should only ever get called by the reactor that runs the thread being destroyed. Also, drop spdk_set_thread from the public API.

  2. Add static function _spdk_thread_exit under reactor.c as follows:

static void _spdk_thread_exit(void arg) { struct spdk_thread thread = arg; thread->exit = 1; }

  1. Update spdk_thread_exit to verify that all the per thread resources have been released and, on positive determination, to send _spdk_thread_exit message to self and invalidate thread pointer.

int spdk_thread_exit(struct spdk_thread *threadp) { / Validate threadp and thread=threadp, fail with -ENOTTY otherwise /

/* Validate every channel under thread->io_channels has zero refcount,

  1. Under spdk_thread_poll, upon each message being processed check if thread is marked as exited, and break if set.

  2. Under reactor main loop, check if thread is marked exited upon return from thread_poll. If set, remove respective lwp from the reactor queue, and destroy the thread.

Assuming messages are delivered in order, the above guarantees the thread will consume and process resource-related messages (such as _spdk_put_io_channel) prior to receiving _spdk_thread_exit message, whereupon it may be safely deleted.

Regards, Andrey

On Wed, Jan 29, 2020, 15:36 Andrey Kuzmin andrey.v.kuzmin@gmail.com wrote:

On Wed, Jan 29, 2020, 14:33 shuhei-matsumoto notifications@github.com wrote:

Hi All,

We are still busy for the next release but the patch series are ready to review starting from https://review.gerrithub.io/c/spdk/spdk/+/479390/11

I tried to incorporate all discussion by Ben and Andrey on mailing list and Vitaliy’s patch.

Hi Shuhei,

nice to see this moving forward. Could you please point me to the specific patch re spdk_thread?

Thanks, Andrey

I completed iSCSI first.

I’ll move to bdevperf and NVMe-oF target in the mean time

Thanks Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB56I6L27X7U2UP57ZTRAFSQLA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKG37SI#issuecomment-579715017, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB7ACMZQ7ENZ3RAJXA3RAFSQLANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

Thank you for reviewing the patch series. I'm glad to read your detailed comments.

Yes, we will be able to do (1) then. From (2) to (6),

If we take your idea, we may need to repeat message until freeing IO device. Freeing IO device may be deferred until the final spdk_put_io_channel() is called. We can't request the thread to call spdk_put_io_channel() and don't know when it comes.

Besides, I think we want to stop receiving new message or registering new poller after spdk_thread_exit().

In particular for (3), even if we invalidate thread pointer in spdk_thread_exit(), the later message handler or poller will still see the valid thread pointer.

So I may not understand your idea fully yet but I want to keep the current idea. Of course I will read your comment more today and may follow your idea.

Thanks, Shuhei

ak-gh commented 4 years ago

On Thu, Jan 30, 2020, 01:26 shuhei-matsumoto notifications@github.com wrote:

Hi @ak-gh https://github.com/ak-gh

Thank you for reviewing the patch series. I'm glad to read your detailed comments.

Yes, we will be able to do (1) then. From (2) to (6),

If we take your idea, we may need to repeat message until freeing IO device.

I'm not quite sure as to why. Correct thread behavior, IMO, is to release I/O channels and unregister pollers. At this point, there's nothing else the thread is required to do to exit as the rest occurs in/is the responsibility of the respective subsystems.

In the specific case of the I/O device, freeing it up will occur once thread processes all _spdk_put_io_channel messages to itself (and this is guaranteed to happen before it receives _spdk_thread_exit message), so I don't see any reason for the repetition.

Thanks, Andrey

Freeingg IO device may be deferred until the final spdk_put_io_channel() is called. We can't request the thread to call spdk_put_io_channel() and don't know when it comes.

Besides, I think we want to stop receiving new message or registering new poller after spdk_thread_exit().

In particular for (3), even if we invalidate thread pointer in spdk_thread_exit(), the later message handler or poller will still see the valid thread pointer.

So I may not understand your idea fully yet but I want to keep the current idea. Of course I will read your comment more today and may follow your idea.

Thanks, Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB7N6LFIQH4FSE5OTBTRAH7CTA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKI7JVA#issuecomment-579990740, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WBZVXDYZ6L3L3E63OWDRAH7CTANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

The thread which unregisters I/O device and the thread which closes the last I/O channel may be different. In this case, the later thread will send message to the former thread, and the former thread will complete unregistration. Until the later thread sends the message, the former thread will repeat sending self message.

And what we want to do is to ensure the exiting thread never receive any new message or register any new poller. Setting thread->exit first will be the reliable way.

This may be personal opinion but I don't think "the series represents a massive change". I think the patch series follows Ben's original idea.

I hope you didn't say my patch series would not work :)

-Shuhei

ak-gh commented 4 years ago

On Thu, Jan 30, 2020, 10:14 shuhei-matsumoto notifications@github.com wrote:

Hi @ak-gh https://github.com/ak-gh

The thread which unregisters I/O device and the thread which closes the last I/O channel may be different. In this case, the later thread will send message to the former thread, and the former thread will complete unregistration. Until the later thread sends the message, the former thread will repeat sending self message.

I believe this isn't the case. Nothing in the io device API or implementation requires this.

And what we want to do is to ensure the exiting thread never receive any new message or register any new poller.

The only reliable way to achieve this is to put the responsibility on the programmer. Once thread has called spdk_thread_exit, sending it any message or registering a poller is a bug (and that's why I suggested spdk_thread_exit should invalidate thread pointer).

Settingg thread->exit first will be the reliable way.

This may be personal opinion but I don't think "the series represents a massive change". I think the patch series follows Ben's original idea.

I hope you didn't say my patch series would not work :)

Nope, I believe it works. My only concern is that it causes a lot of extra effort

Regards, Andrey

-Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB55SJVDII3N7CPOM33RAJ44NA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKJ5W6I#issuecomment-580115321, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB2TP2XE2IEI7EZCJX3RAJ44NANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Thank you @ak-gh

I want to get review to the whole series first. I'm glad if it's OK for you.

In the end of the series, iSCSI target is implemented so that the master threads calls io_device_register(), and then create children threads and then children threads call get_io_channel() at startup. vice-versa at shutdown. I hope the patch series fit the design well. If we change that design, it may be needed to change the whole series. I'm happy to change if so.

By the way, according to the brief discussion with Ben, Ben suggested me to create thread not per io_target_group but per io_target. In bdevperf, io_target is per CPU core and per bdev, and io_target_group is per CPU core. I'm working on it in the mean time. FIO creates threads per job.

Thanks, Shuhei

shuhei-matsumoto commented 4 years ago

I read your last comment again.

I'm glad to get your consensus about "Settingg thread->exit first will be the reliable way." Invalidating thread pointer may be temporary because reactor may set it again at the next schedule, and I feel setting the flag is a little more safe than nullifying the pointer.

Anyway thank you @ak-gh

ak-gh commented 4 years ago

On Thu, Jan 30, 2020, 10:40 shuhei-matsumoto notifications@github.com wrote:

Thank you @ak-gh https://github.com/ak-gh

I want to get review to the whole series first. I'm glad if it's OK for you.

In the end of the series, iSCSI target is implemented so that the master threads calls io_device_register(), and then create children threads and then children threads call get_io_channel() at startup. vice-versa at shutdown. I hope the patch series fit the design well. If we change that design, it may be needed to change the whole series. I'm happy to change if so.

If that's what is underpinning the exit patches, I don't think designing a key subsystem so as to suit the behavior of a single application is any good approach.

Let me restate, just to be clear: I don't see anything in the thread API that would prevent the message-based exit solution. Further, I believe deferred thread exit to be much more inline with the generic asynchronous SPDK design.

Regards, Andrey

By the way, according to the brief discussion with Ben, Ben suggested me to create thread not per io_target_group but per io_target. In bdevperf, io_target is per CPU core and per bdev, and io_target_group is per CPU core. I'm working on it in the mean time. FIO creates threads per job.

Thanks, Shuhei

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB3AXRZ57A33XA7BAPLRAJ76ZA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKJ7WYQ#issuecomment-580123490, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB26GUIHPYYMUM7SDG3RAJ76ZANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

I don’t think my patches will limit use cases, and Ben went through my patches. If we set thread->exit we don’t have to use message.

Anyway let’s get more review as I said.

Will you provide any usecase or corner case such that my patches would not work? I put many unit tests

See you tomorrow Shuhei

shuhei-matsumoto commented 4 years ago

One question to @ak-gh

Your use case uses custom io channel and custom io devices?

ak-gh commented 4 years ago

On Thu, Jan 30, 2020 at 1:17 PM shuhei-matsumoto notifications@github.com wrote:

One question to @ak-gh https://github.com/ak-gh

Your use case uses custom io channel and custom io devices?

Yes, I do that as well.

Regards, Andrey

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WBZ73HWQQIGOXAHA6DTRAKSMBA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKKOOHA#issuecomment-580183836, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WB3JIL25RF77W7TZSW3RAKSMBANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

Hi @ak-gh I used my head when I walked to home :) Honestly I don’t get why message is necessary. Thread destroy is determined by reactor. Reactor is just underneath thread. On the other hand message is above thread.

So my patch should be simple and ok. Ben agreed overall once. Let’s get his comment again

shuhei-matsumoto commented 4 years ago

@ak-gh i know bdev layer have used message passing for unregistering

ak-gh commented 4 years ago

On Thu, Jan 30, 2020 at 1:35 PM shuhei-matsumoto notifications@github.com wrote:

Hi @ak-gh https://github.com/ak-gh I used my head when I walked to home :) Honestly I don’t get why message is necessary. Thread destroy is determined by reactor. Reactor is just underneath thread. On the other hand message is above thread.

So my patch should be simple and ok. Ben agreed overall once. Let’s get his comment again

The key issue with setting thread->exit inline, IMO, is that it immediately breaks thread messaging (since reactor won't process any messages from that point on). This enforces inline channel removal and other housekeeping that self-messaging approach cleanly avoids: _spdk_put_io_channel messages are processed as usual, no changes necessary. And that's what I actually like about turning spdk_thread_exit into spdk_thread_send_msg(_spdk_thread_exit)

  • it doesn't break anything.

Regards, Andrey

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/spdk/spdk/issues/1161?email_source=notifications&email_token=AJG6WB2N25BCXBHMLNH4K3TRAKUQXA5CNFSM4KKPF3V2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKKQIKQ#issuecomment-580191274, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJG6WBYYAG7MUQEPNVAJN7DRAKUQXANCNFSM4KKPF3VQ .

shuhei-matsumoto commented 4 years ago

I removed that break from message handler. Did you see my patch?

shuhei-matsumoto commented 4 years ago

Hi @ak-gh

As you felt, I changed overall design and now uses the policy to guard and reap. Ben was very busy but read through my patches and gave me a favorable feedback. Of course he may find my mistake today. But I hope you will get similar impression if you have time to read through. And I believe we agree in the end

Thanks Shuhei

shuhei-matsumoto commented 4 years ago

I close this issue because the main patch was merged. https://github.com/spdk/spdk/commit/abb942bda118280ba8230c5c8098caa80fe8c2f4 Thanks for everyone.

Hi @ak-gh,

I hope the merged feature is sufficiently simple and usable. Please feel free re-open or create an new issue if you find any issue.