Open RheingoldRiver opened 1 year ago
I'd say we de facto have "long-term stability of usage" - anyone using mwclient now has to be pretty used to how it works at present =) I guess I wouldn't want to make any overly radical changes. Improving documentation would definitely be a good idea.
I have never written much async stuff and don't have any interest in discord bots myself, so I'm not sure I have a lot of opinions about that side. It seems like aiomwclient is basically....mwclient, only with async def
everything? As I said, I'm not that familiar with async, so...how disruptive would it be to merge that kind of thing into mwclient proper, but preserving the synchronous interface for users who have no interest in async usage?
One thing to consider regarding stability is that in a 1.0 release, if we retain page.save
, we will certainly need to keep it forever - currently both page.save
and page.edit
do the same thing. IMO there is no question that we should keep both (I've personally always used page.save
) but if there's some reason to use edit
only, this change should be made in 1.0 I think.
I'm not sure if there's any other similar aliases that exist.
It seems like aiomwclient is basically....mwclient, only with async def everything?
It's been a while since I worked on it, but most of the logic is (or should be) the same. The main difference is that all the HTTP requests are performed using an async library (aiohttp
) instead of a sync one (requests
).
To have both in one package, there's a few options, the most prominent in my opinion are
requests
or aiohttp
(or some other sync/async network libraries)asyncio.run()
)Option 1 would likely perform the best as there doesn't have to be a new event loop for every synchronous request. Both would probably perform the same for async calls.
Downside is that you'll need to have async libraries as dependencies (asyncio, aiohttp) even when a user only wants synchronous code.
I'm not super familiar with what Python packages can do, but if they have something like Rust's crate features, you might be able to avoid needing the async dependencies for the sync API.
I'm gonna suggest we go ahead and do a 1.0 release with more or less what we have now (we will need to come up with a new way of doing releases, though, as I think the old way was tied to travis; we can use a script like I do for my personal projects, or a GHA workflow maybe). I don't think it makes sense to delay a new release to meet all the 'gold standard' requirements - though we should still work towards that! - and I don't think it makes any sense for the project to be versioned 0.x, really. Practically speaking this is a stable library that downstreams have been relying on for years (I'm one of them). 1.x versioning would be appropriate.
I agree with @RheingoldRiver that we should keep page.save
and page.edit
and just accept that both will exist forever. It's fine, there's no shortage of verbs. :)
Similarly on the async stuff, I don't think it should hold up doing a 1.x release now. I think we can definitely look at it, though. I'll file a separate ticket for that.
I'm out of town at the moment, so my involvement will be quite low for the next few weeks, but agreed with all the above!
Heads up on this for other maintainers @waldyrious @btongminh @marcfrederick - I've finally got a bit of spare time and am trying to set up a release workflow now following the guide at https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/ . You'll probably see some notifications from pypi and github related to this.
https://github.com/mwclient/mwclient/pull/331 adds a release workflow.
OK, I just tested the release workflow using testpypi and a test repo - https://github.com/AdamWill/mwclient-test , https://test.pypi.org/manage/project/mwclient/releases/ . as you can see, it seems to work fine.
So, I propose to do a v1.0.0 release later today by just doing bump-my-version bump major
and pushing the resulting tag. That should automatically produce a pypi and github release.
Do any other maintainers think we're not ready yet? @mwclient/maintainers
Sorry I've been a bit busy, but is #259 addressed? I would propose to introduce chunk_size
as an alias for limit, expose max_items
, and deprecate limit
not necessarily with the goal of removing it, but so that we have a clear warning that limit
isn't what it seems to be
oh, no, I don't believe we did anything about that. I haven't looked at it in detail but your proposal sounds sensible, do you want to send a PR?
yep, I should be able to within the next couple days!
Looking at the milestone, there's also https://github.com/mwclient/mwclient/issues/197 .
Sounds good to change it but not sure what I think we should change it to. I think it might be the least confusing to accept **kwargs
and send that on to the request, but if we wanted to keep it a named param what about extra
? What do you think?
let's discuss it over in that issue.
Okay I've started this and it's quite a bit more complex a refactor than I'd anticipated. I'm not sure I feel comfortable doing this right before a major release, what do you think?
@RheingoldRiver you mean https://github.com/mwclient/mwclient/issues/259 ? Hmm...maybe we can just add a deprecation warning to limit
saying its replacement will be determined in future? Or just the "rename it to chunk_size
and deprecate it" bit? What exactly is the complex bit?
limit
is referenced in dozens of different functions and is probably passed as a kwarg on at least one occasion. Once I started making the change I realized I'd be touching a lot of code to cover everywhere that it's used.oh, yeah, hmm, I see...there are even a couple of places (random
and exturlusage
at least) where we pass a specific value which looks like we're expecting it to work the same way the OP thought it did. random()
definitely does not work as documented:
>>> for pg in wk.random("QA", limit=2):
... print(pg)
...
OrderedDict({'id': 21941, 'ns': 0, 'title': 'SystemConfig/bind/review'})
OrderedDict({'id': 36951, 'ns': 114, 'title': 'Test Day:2010-10-14 OpenLDAP/NSS'})
OrderedDict({'id': 13590, 'ns': 6, 'title': 'File:Artwork FC7Themes Fc7ThemeProposalFlyingHighRound2 kdeScreenShot.png'})
OrderedDict({'id': 51713, 'ns': 6, 'title': 'File:F20-7.jpg'})
OrderedDict({'id': 31068, 'ns': 1, 'title': 'Talk:Features/Sugar 0.88'})
OrderedDict({'id': 46177, 'ns': 2, 'title': 'User:Rajan2251'})
OrderedDict({'id': 85143, 'ns': 116, 'title': 'Test Results:Fedora 37 RC 1.4 Installation'})
OrderedDict({'id': 9941, 'ns': 0, 'title': 'Infrastructure/RFR/FedoraMagazine'})
OrderedDict({'id': 12478, 'ns': 6, 'title': 'File:Artwork MarketingCollateral poweredby sticker mono.png'})
OrderedDict({'id': 51878, 'ns': 116, 'title': 'Test Results:Fedora 20 Beta TC4 Desktop'})
OrderedDict({'id': 3480, 'ns': 0, 'title': 'QA/TestCases/BootMethodsInstallFromBootIsoIfs'})
OrderedDict({'id': 15489, 'ns': 2, 'title': 'User:Tw2113'})
OrderedDict({'id': 27371, 'ns': 2, 'title': 'User:Melan'})
OrderedDict({'id': 84867, 'ns': 6, 'title': 'File:Mairacanal.jpg'})
OrderedDict({'id': 7013, 'ns': 0, 'title': 'L10N Kannada Team'})
OrderedDict({'id': 29037, 'ns': 6, 'title': 'File:Renderchecklog-plouj-radeon.gz'})
OrderedDict({'id': 24458, 'ns': 14, 'title': 'Category:Ambassadors from Kyrgyzstan'})
OrderedDict({'id': 79496, 'ns': 2, 'title': 'User:Lruzicka/Testcase switch stream'})
OrderedDict({'id': 57468, 'ns': 0, 'title': 'Fedora 23 talking points'})
OrderedDict({'id': 8169, 'ns': 0, 'title': 'FirstBoot'})
...
that's a lot more than 2! Same applies to page.revisions()
, and several other places, I think.
So on the one hand this definitely looks like a big change, OTOH it does kinda seem like something it would be best to fix before doing a 1.0 release...
It does kinda look like almost all the usages follow a pattern of taking a limit
arg and passing it through to something from listing
. So we could probably write a little helper to handle the deprecation for all of them. On the whole...it would definitely be a big commit but I think I'd lean towards getting it done before 1.0. Any thoughts from the rest of @mwclient/maintainers ?
How about making this change to limit
, and then releasing mwclient 0.11.0? Then we could aim for a 1.0 release around the time that mediawiki 1.43 happens, with no changes planned for that milestone unless we need to fix something from 0.11.0. That should give enough time to test the changes widely before we announce 1.0
Sure. We could also call it something like 0.90.0 to make it clear it's an RC, but I'm bikeshedding :) Would also shake out any non-obvious issues in the new release pipeline too, I guess.
Yeah, I'd also be in favor of releasing v0.11.0
before v1.0.0
. It's been more than three years and almost 50 merges (1,088 additions and 270 deletions across 32 files) since the release of v0.10.1
, so there's a good chance a bug or two will surface once all those changes are released. If all goes well we can then release that version more or less unchanged as v1.0.0
.
So, there are more complications! We have two large PRs that sort of conflict: https://github.com/mwclient/mwclient/pull/336 adds type annotations to the entire codebase, https://github.com/mwclient/mwclient/pull/299 adds user management (but doesn't have type annotations). I do not have a ton of time to work on these, as you can tell from how they're lying around.
We still don't have the limit
change finished - @RheingoldRiver , are you still working on that? If not I'm happy to go ahead and send a PR with one of my suggestions (probably the first one, as it's easier).
I think I'd be in favor of doing a 0.11.0 with that change if possible but not with either of the others, then trying to land those two (and anything else) before a 0.90.0 / 1.0.0 release. Thoughts? I've set up a 0.11.0 milestone and split the harder issues into a 1.0.0 milestone (including a couple old ones Dan tagged for 0.9.1 but never fixed, AFAICS).
Sure, sounds like a good plan. Let's do the v0.11.0 release and then merge #299 (user management) before #336 (type annotations). After #299 lands, I will rebase and update the PR with the limit and user management-related changes.
We could even delay the type annotations until after the release of v1.0.0, getting that release out first and then releasing the annotations as v1.1.0. Adding the annotations is arguably not a breaking change, as it won't break any existing code. Worst case, some editors might show errors/warnings, but there will be no runtime changes.
sure, strictly speaking that makes sense, I guess it's just that anything that touches every file in the project (more or less) "feels" like a 1.0 change to me! But logically speaking that might be the way to go indeed.
@AdamWill I haven't been working on the limit change, and I'd rather not tbh - that one change feels like it touches every single part of the codebase, and I don't have a good enough mental map of how the Listing classes work to be comfortable doing that change.
OK, I'll send a PR for my first idea then and we can see what folks think of that. I think it's kinda the 'safest' approach.
Speaking of PRs, I think #327 and #338 both seem good to me and we can merge
OK, so we merged the limit thing, and it would be good to get the writeapi fix out. @mwclient/maintainers , does anyone have an objection to doing a 0.11.0 release now? I will let this sit for a day and try to do a release tomorrow if nobody can think of a reason why not. Thanks!
I'm good with a release tomorrow, I'll start using it immediately and report if I find any errors 👍
Following up from this comment, what should a 1.0 release of mwclient look like?
Things to consider:
One other thing I'd like to consider is that Discord bots are becoming more and more popular, with discord.py and discord.js being the two main libraries. mwclient should be the go-to library for dpy, however, we don't have any async capability making it ill-suited for discord. Maybe we can add async functions for at least login, read, edit? This client does exist and I think it would be great for it to be merged into mwclient proper. I just spoke to @Wadu436 and he's interested in at least discussing the issue, even if not necessarily contributing code. It's possible that it doesn't make sense to combine these into one library; however, if they remain separate projects I think one is more likely to go unmaintained, which I think is unfortunate.