CXuesong / WikiClientLibrary

/*🌻*/ Wiki Client Library is an asynchronous MediaWiki API client library targeting modern .NET platforms
https://github.com/CXuesong/WikiClientLibrary/wiki
Apache License 2.0
82 stars 16 forks source link

Throttler does not prevent additional threads from performing parallel requests #9

Closed Silic0nS0ldier closed 7 years ago

Silic0nS0ldier commented 7 years ago

While debugging I noticed that the built in throttler is simply a delay before any 'edit' tier modification is performed, meaning that numerous threads could end up submitting changes to a MediaWiki API at the same time.

Additionally, FilePage.UploadAsync does not utilise throttling at all.

It should be possible to implement an inter-thread throttler without introducing any breaking changes.

CXuesong commented 7 years ago

Yeah actually I've noticed this limitation from the very beginning. Because MW doesn't suggest you to access MW API too frequently, and trottling will actually enforce actions (edits, at least) performed in series.

But things can be a little bit different if you'are working with multiple MW sites, and I can try to implement the trottling per site logic using asynchronous semaphore or something.

However, I still think it would be better if such trottling/queuing logic can be imeplemented in the client-side code, because you can control the speed that you produce the edit requests.

Suppose there are a lot of page-editing requests queued up, considering the trottling delay before each edit request, it may take considerable time to finish up all the edit requests. On the one hand, it may take too much memory to hold all the impending edits; on the other hand, if other users happen to edit the page during the (long) delay before your client actually send the edit request, an edit conflict will happen; though this can be detected by OperationConflictException, extra logic might need to be introduced to handle such cases.

Silic0nS0ldier commented 7 years ago

Memory is a valid point, but if the client awaits the edit action then there wouldn't be an issue. Meanwhile a cross-thread throttler would prevent any unintended flooding of the wiki. May also allow for better utilisation of time (no waiting around on the delay when only one edit has been made), which for any GUI application using the library would result in better perceived performance.

CXuesong commented 7 years ago

So now I have introduced a Throttler class which can be accessed by WikiSite.Throttler property.

Btw I decided to make some adjustments on namespace organizations, so WikiClientLibrary.Site is now WikiClientLibrary.Sites.WikiSite. There are a lot of other name changes so I bumped version to 0.6, but the work is not quite done yet. Still, you can take a look at those pre-releases if you're interested. Most changes can be achieved by text replacing and following Roslyn's tips, and/or with the help of some refactoring tools. Just hope you won't feel too bothered…

Silic0nS0ldier commented 7 years ago

No worries with the namespace/class renaming, since this is still a 0.x release, and therefore is implied to be non-final. (semver is a thing though, and nuGet like every package manager respects the standard. Post version 1.0, it's definitely something to adhere to. Devs can get cranky. )

Good to see the throttler improvements up, I'll let you know how it goes the next time I need to perform a job on the wiki.

CXuesong commented 7 years ago

I just feel assured that 0.x implication actually holds :-) Okay, then just help yourself, and re-open this issue if something goes wrong.