Open ilya071294 opened 1 month ago
Consider also: how it is differs from user attachment's metadata cache ? why worker atts should be special in this regards ?
PS bug, really ?
PS bug, really ?
Not so far from a deadlock, only the timer helps. If a script is executed when there are no concurrent user attachments, I would expect that there won't be any suspicious hang-ups. It's not critical for this script but for a larger one it can be. A typical case - recreating a bunch of indexes. The worker attachment generally is not special but its role is still different. At least it basically helps to perform a task created by the user attachment. And I wouldn't consider it right that it can impede actions in user attachments this way.
You found a special case and going to fix it with very inefficient, but simple, way. It will not work in general. It is not necessary in most use cases. There should be a way to clear MET cache on purpose, like in DSQL statements cache. It should be used when required, not in advance as in DSQL statements cache.
PS DSQL statements cache also better to be cleared when really required, not in advance as currently, IMHO.
Is it possible that the problem is already solved in @AlexPeshkoff's shared metadata cache branch?
You found a special case and going to fix it with very inefficient, but simple, way.
I wonder how you came to this conclusion. Both parts are false actually :) I mean it wasn't me who found it (the case is practical, not theoretical) and I'm not against finding a better solution. Otherwise I wouldn't ask.
There should be a way to clear MET cache on purpose, like in DSQL statements cache. It should be used when required, not in advance as in DSQL statements cache.
I agree. Then we need to find all places that require to clear the MET cache. Currently in FB5 I see at least 2 places: delete_index
and delete_relation
. MET_clear_cache
is called there to clear the cache of a user attachment. Will it be OK to add here similar calls for worker attachments? I guess this is also not perfect because a worker attachment has a chance to lock the index again before our lock attempt (but it should be possible only if there are other user attachments executing parallel tasks).
On 10/18/24 11:10, Dimitry Sibiryakov wrote:
Is it possible that the problem is already solved in @AlexPeshkoff https://github.com/AlexPeshkoff's shared metadata cache branch?
Right now not ready to give reliable answer - I'm finishing (fingers crossed) with indices, currently code is broken. As far as I understand there will not be such problem with new code but no 100% guarantees.
Then we need to find all places that require to clear the MET cache. Currently in FB5 I see at least 2 places:
delete_index
anddelete_relation
.
Yes
MET_clear_cache
is called there to clear the cache of a user attachment. Will it be OK to add here similar calls for worker attachments?
The point was to use "lock and AST" way to call MET_clear_cache
in all attachments.
For start (not detailed and may be not fully correct):
Jrd::Statement()
and TRA_post_resources()
should take lock in SR mode (allow to put objects into cache), MET_clear_cache()
and release the lockI guess this is also not perfect because a worker attachment has a chance to lock the index again before our lock attempt (but it should be possible only if there are other user attachments executing parallel tasks).
It should not be possible with a scheme above
PS Next time you want to ask, use fb-devel, pls - it is better suited for such kind of exchange of opinions.
Steps to reproduce are kind of similar to #7783 but they show a completely different problem.
Execution hangs for a minute at the last line
drop index t1_idx1
. This happens because worker attachments have been creatingt1_idx2
index recently, and after loadingT1
table metadata they still haveP1
procedure in the metadata cache. The statement of the procedure holds SR lock ont1_idx1
index. A minute later the timer thread releases worker attachments, anddrop index t1_idx1
is finally completed. A simple solution is to callMET_clear_cache
before putting a worker attachment intom_idleAtts
array. I've tried it and it works. On the one hand, clearing the cache can be useful because in a general case the worker attachment can perform totally different tasks. On the other hand, different tasks may use the same objects. Any opinions?