tarantool / vshard

The new generation of sharding based on virtual buckets
Other
100 stars 30 forks source link

Log fiber #364

Closed Serpentian closed 1 year ago

Serpentian commented 2 years ago

Closes #107

Gerold103 commented 1 year ago

Didn't review the patch yet, but if you are doing the option from the big Alexey's comment, then most importantly don't forget to expose 'status' for the background workers. The log history isn't too important but the status is.

Serpentian commented 1 year ago

Didn't review the patch yet, but if you are doing the option from the big Alexey's comment, then most importantly don't forget to expose 'status' for the background workers. The log history isn't too important but the status is.

The question here is how we'll work with these fibers states. Are they supposed to be focused on errors or not? E.g. if there's an error occured during fiber execution, should we set the state of the fiber to error and wait till the error is gone?

If we do it a way Alexey described in the second solution, this error will be dropped pretty soon by sleeping state e.g. and it doesn't have much sense in saving it. In that case we can grep the error message from the logs, which are saved in the same module.

Gerold103 commented 1 year ago

E.g. if there's an error occured during fiber execution, should we set the state of the fiber to error and wait till the error is gone?

We should keep the error state until all becomes fine.

If we do it a way Alexey described in the second solution, this error will be dropped pretty soon by sleeping state

Sleeping state shouldn't exist, Alexey probably gave it as an example. Any fiber is sleeping except yours anyway, it is not multithread (call fiber:status() on any fiber except self, and it will be suspended). One could argue that this means "state machine sleeping", but still I think the error state should be the strongest one - IOW it can only be overridden by good state. It doesn't have to be a single field though.

Another option is that we could have separate state and activity. State would be error/balanced/good/full/etc. Activity would be backoff/idle/discovering/sending buckets/etc. Essentially, state would be like vshard.storage.info().status - a simple string which tells whether all is good or bad. It is very useful for monitoring too, not just for logs. For example, cartridge GUI could translate the states to colors like it does for whole instances depending whether they are alive.

Exact names for each background service states could be decided during the review. Or you could try to make an RFC with the whole design in terms of API at least. In "Discussions" tab of this repo. It would also allow to collect some feedback from potential users (usually there are 2 people who give feedback to almost everything).

Serpentian commented 1 year ago

Or you could try to make an RFC with the whole design in terms of API at least.

Sounds like a good idea. I'll write the RFC ASAP. Current implementation doesn't work in the way you described. It's better to discuss everything more precisely before making changes.

Serpentian commented 1 year ago

RFC is already up there: https://github.com/tarantool/vshard/discussions/365. It would be great to discuss, how all of this should work, when you have some time for this, so I could continue working on it.

Serpentian commented 1 year ago

The main problem with writing tests with such services is test hanging. I suppose, we should log, what are we doing with services.