Closed cofinalsubnets closed 1 year ago
Thank you for your work! The idea of moving mutexes to repos seems fine, it is a progress in the proper direction. The issues I see so far are:
Ideally, this would definitely reduce many race conditions but it won't defeat them, as all maps which can be accessed concurrently may show those issues again. I do not see an easy way out of this, until the major pacoloco refactoring gets fixed
BTW, I am saying those things hadn't tested your work so far, sorry
Sure, i can add that test. Can you elaborate on how these errors could still happen? It seems like there are only two maps left (config.Repos
and downloadingFiles
), and of those, only downloadingFiles
could be read/written concurrently, since config.Repos
isn't written to after it's parsed from YAML. And downloadingFiles
is already behind a mutex. So i don't think concurrent map writes are possible anymore, but i could be missing something.
Thank you for your worj @gw000. I like the idea of using finer-grained concurrency.
@Focshole I updated the tests to only ignore the mutex, and check the timestamps. I also added a config test to make sure the new timestamp fields (which are exported to work with cmp.Equal
) won't be unmarshalled from YAML, if the default keys should happen to be present for some reason.
IMO using RWMutexes would be premature at this point. There would only be one or two very quick reads that could be protected with read locks, and it would add complexity to a critical code path for not much performance gain.
@anatol Thank you for creating this project!
Sure, i can add that test.
I mean, it is enough to add the same checks you see for the whole configs also to the individual structures that should have been checked before. Something like "if want.repos!=nil..." and then you check the "got" "want" with the structures. Sorry if I have explained myself poorly, I wrote some comments about it
Can you elaborate on how these errors could still happen? It seems like there are only two maps left (
config.Repos
anddownloadingFiles
), and of those, onlydownloadingFiles
could be read/written concurrently, sinceconfig.Repos
isn't written to after it's parsed from YAML.
Well, not exactly. The config repos can be updated if it is specified through mirrorlist (it has to read periodically the mirrorlist file if it has been modified, then periodically update the relevant repo urls). The race condition scenario I see is the following:
With a RWMutex, this scenario can be avoided, as read locks allow concurrent reads but not writing when reads are happening :+1:
And
downloadingFiles
is already behind a mutex. So i don't think concurrent map writes are possible anymore, but i could be missing something.
True
Maybe i'm confused but i don't think the race condition you're describing could arise. If the mirrorlist has been updated, the thread that discovers that will lock the repo, preventing any other thread from reading the URLs/starting to download until the mirrorlist has been parsed and the sources have been updated. If, hypothetically, the other thread won the race, then it would be the one to find the mirrorlist out of date, and the other thread would wait. Again, i think an RWMutex is very much overkill here.
Maybe i'm confused but i don't think the race condition you're describing could arise. If the mirrorlist has been updated, the thread that discovers that will lock the repo, preventing any other thread from reading the URLs/starting to download until the mirrorlist has been parsed and the sources have been updated. If, hypothetically, the other thread won the race, then it would be the one to find the mirrorlist out of date, and the other thread would wait. Again, i think an RWMutex is very much overkill here.
The logic behind avoiding race conditions involves defining proper access control to the relevant structures that are being accessed concurrently both RW. The structures that I see that are being accessed concurrently are:
Repo.LastMirrorlistCheck
Repo.LastModificationTime
Repo.URLs
Thanks to your work, they are much smaller structures than before. I see that both
Repo.LastMirrorlistCheck
Repo.LastModificationTime
Have been protected with Repo.mutex
. (BTW, shouldn't it be Repo.Mutex in order to be accessed publicly? Probably I'm wrong here). However Repo.URLs
access seems unprotected.
In this scenario, This instruction:
https://github.com/gw000/pacoloco/blob/fix-map-concurrency/mirrorlist.go#L60
And this instruction:
https://github.com/gw000/pacoloco/blob/fix-map-concurrency/pacoloco.go#L311
Can be executed at the same time.
What about yet another mutex :smile: ? Something like Repo.urlsMutex
. Going with a normal mutex would be nonsensical, as most access are just reads that can be done concurrently. Golang won't make writes starve, it is in its design (can't find the source, sorry).
repo.urlsMutex.RLock() before the URLs are being used repo.urlsMutex.RUnlock() after the URLs are being used repo.urlsMutex.Lock() only when they are actively being updated
What do you think?
EDIT: I just realized I am wrong
Thanks for explaining, i see the error now. I'll add another mutex.
Thanks for explaining, i see the error now. I'll add another mutex.
well, wait, the original scenario is wrong, but the conclusion that it may occasionally happen is still right.
With the blocking mutex on checkAndUpdateMirrorlistRepo
, the only occasion it could happen is if pacoloco has a quite high volume of requests, in which:
for
cycle because of the second requestI added read locks around the two repo.URLs
loops in handleRequest
and prefetchRequest
, and a write lock around the mirrorlist update.
Unless i'm very confused, the tests already do what you want. IgnoreFields
is selective about the fields it ignores, that's why it takes their names as arguments. https://pkg.go.dev/github.com/google/go-cmp/cmp/cmpopts#IgnoreFields
That seems to be a bug in master. Or is it even a bug? There's no comparison at all in that test, seems like it just makes sure parsing doesn't panic.
My fault, sorry for bothering in those tests :facepalm: I tested them and they do what they should. Sorry again
If mutexes should not be capitalized (i am not sure about what we should do about it), this is ok to be merged IMO!
I don't mind. I left them lowercase just to make sure YAML would ignore them, and because unlike the time fields, they don't need to be accessible to cmp.Equal
. They could also be uppercase with a yaml:"-"
annotation. Let me know which you prefer :slightly_smiling_face:
It is indifferent to me, if it works and it is maintainable it is fine to me! Well done! :+1:
Hi, thanks for this great project. The concurrent writes issue from #44 seems to be crashing my pacoloco installation every couple of days. I think this change will fix the problem without adding too much extra complexity: the top-level
time.Time
-valued maps are removed and the times are stored on eachRepo
instead; and the write toconfig.Repos
is removed by storing*Repo
values and giving each one its own mutex.Please let me know what else i can do to help. Thank you! :slightly_smiling_face: