GothenburgBitFactory / bugwarrior

Pull github, bitbucket, and trac issues into taskwarrior
http://pypi.python.org/pypi/bugwarrior
GNU General Public License v3.0

Break out all services into separate packages? #966

Closed by djmitche 1 year ago

djmitche commented 1 year ago

I see that we've been using entry points to define services for a long time, which suggests that it's possible to define a service in an external PyPI package.

That's useful for cases where a user wants to build a sync to some private or proprietary service.
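To make the mechanism concrete, here is a minimal sketch of how entry-point discovery works in general, using only the standard library. The group name `bugwarrior.service` is an assumption here (check the project's packaging metadata for the real one), and `discover_services` and `load_service` are hypothetical helper names, not bugwarrior functions.

```python
from importlib.metadata import entry_points

# Assumed group name; check bugwarrior's packaging metadata for the real one.
SERVICE_GROUP = "bugwarrior.service"


def discover_services(group=SERVICE_GROUP):
    """Return {name: entry point} for every installed package registering in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports a select() filter.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_service(name, group=SERVICE_GROUP):
    """Import and return the object that the named entry point refers to."""
    services = discover_services(group)
    if name not in services:
        raise LookupError(f"no service named {name!r} is installed")
    return services[name].load()
```

Any installed distribution that declares an entry point in the same group would show up in `discover_services()`, whether or not it lives in the bugwarrior repo.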

Should we start shipping the various services as separate packages? That might (a) provide a better example for someone wanting to implement their own service package, (b) help ensure the bugwarrior API for services is stable, and (c) reduce the testing challenges for changes to a single service.

In order to maintain compatibility, we could continue to make the bugwarrior package include all of the current services, by depending on bugwarrior-core and bugwarrior-$service for each $service.
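As a rough sketch of that meta-package idea: the dependency list could be generated from the list of services. The package names follow the `bugwarrior-core` / `bugwarrior-$service` pattern proposed above and do not exist today; the version number, the exact-version pinning, and the helper name `meta_package_requirements` are all assumptions for illustration.

```python
def meta_package_requirements(version, services):
    """Build an install_requires list for a meta-package pinned to one release."""
    pinned = [f"bugwarrior-core=={version}"]
    pinned += [f"bugwarrior-{name}=={version}" for name in sorted(services)]
    return pinned


# Hypothetical version and a small subset of services, for illustration only.
requirements = meta_package_requirements("1.0.0", ["github", "trac", "bitbucket"])
```

Pinning everything to one version is only one option; it trades flexibility for the guarantee that the bundled services were tested against the bundled core.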

We could keep all of the currently-supported services in the same repo, at least for now -- that will make CI a lot easier.

Thoughts? If this seems like a not-bad idea, I can work on it.

ryneeverett commented 1 year ago

I think this is a duplicate of #775. You might take a look at #770 and #777 also, as a potential alternative approach.

I'm all for finding a solution to your use case -- "sync to some private or proprietary service" -- but I'm not convinced by the other supposed advantages:

> Should we start shipping the various services as separate packages? That might (a) provide a better example for someone wanting to implement their own service package, (b) help ensure the bugwarrior API for services is stable, and (c) reduce the testing challenges for changes to a single service.

(a) This advantage seems pretty weak to me. Have you seen the documentation on creating a new service? I'm sure there's room for improvement on that page but breaking the examples into separate packages seems like a negligible improvement.

(b) Is stability of the Python API an advantage or a disadvantage? I'm probably in the minority here (amongst bugwarrior maintainers), but I don't consider bugwarrior's Python API to be public, even though we haven't prepended every object with an underscore. I would make the distinction between libraries and applications and would argue that bugwarrior is currently just an application and not a library. I'm not aware of anybody currently using it as a library and there haven't been any complaints about the API changes we've made. I would be pretty open to stabilizing specific aspects of the API if somebody had a need for that, but is there an advantage to stabilizing the whole thing? By making a private API into a public (stable) API, we lose a lot of flexibility. For what gain? (See #791, which is specifically about stabilizing/documenting the Python API.)

(c) What are the testing challenges for changes to a single service?

Meanwhile, the overall proposal threatens substantial disadvantages:

  1. From a bugwarrior-core development perspective: (a) Having to maintain stable Python APIs means we've lost some flexibility and always have to consider compatibility when refactoring. I'm not convinced that we're that close to an optimal API and I feel like this would discourage major refactoring efforts. (b) Do we endorse third-party service packages? Do we make them optional dependencies? How do we make those decisions?
  2. From a user perspective, there are now new possibilities for confusion and frustration, such as: (a) compatibility issues between service packages and bugwarrior-core; (b) inconsistent quality amongst third-party service packages; and (c) having to navigate a bunch of third-party projects (with competing forks!) to find the service you want.
  3. I think there would be a lot fewer working services in the future. I would wager that the vast majority of services have been contributed by folks who either no longer use that service or no longer use bugwarrior. The only reason they (might) work today is that we have a centralized repository where they've been maintained in compatibility with core. In a fragmented ecosystem, most services outside the monorepo would not be maintained.

I believe bugwarrior's main value proposition to folks that would want to build a private service is synchronization. As an alternative approach, would it make sense to expose just this functionality as a Python API? See #913 where I started a similar refactor. Maybe -- with the necessary adjustments -- the Synchronize class could be a public API that would be sufficient for private service use cases?

Please don't take this as strong opposition to your proposal. I'm skeptical yet open-minded.

djmitche commented 1 year ago

Those are all great points. I'll give it some thought!

djmitche commented 1 year ago

I certainly see the arguments for not having a public API. I can think of other "plugin ecosystems" that have devolved into a chaos of abandonware and random tarballs posted on forums, and that would be a bad place to end up.

The development friction of having to maintain API compatibility has to be balanced against the friction of maintaining a lot of unfamiliar code (which is what I meant by the poorly-chosen phrase "testing challenges"). I see there are specific tox environments for two different versions of Jira. And it looks like VersionOne has been acquired, so maybe their API will change. Or maybe there's a vulnerability in the version of pyac used by Bugwarrior, and the newer versions have a different API. By keeping the services in-tree there's an implicit expectation that the BW devs will fix those things, or decide to drop support. That sounds draining!

To the specific need that brought me here, since you mentioned some alternatives: Bugwarrior has three "levels" of utility, I think: (1) synchronizing some list of issues against a task database; (2) coordinating and configuring such synchronizations, with a single config file and shared config options around tag handling, etc.; and (3) support for specific services. The service I want to talk to is proprietary, so adding it to Bugwarrior (3) is not an option for me. But I'd really like to get (2), so it can be configured alongside the other syncs I have set up. Just exporting the Synchronize class only gets me (1). An unstable API means it's going to be difficult to keep a local patch running based on the creating-a-new-service docs. #770 might be the closest I could get, where I provide translation from the proprietary system into a neutral data format, and then include a pointer to that data in the Bugwarrior config.

At any rate, I'll mark this as closed since it's not a direction you would like to go.

ryneeverett commented 1 year ago

Note: I don't know if you personally want to pursue this, but I don't want to leave it looking like a dead end for the next person.

> By keeping the services in-tree there's an implicit expectation that the BW devs will fix those things, or decide to drop support. That sounds draining!

I see where you're coming from, but in practice these have been easy calls to make so far. If a service breaks due to external API changes and no user steps up to fix it, I don't feel any obligation to continue support.

I also think we should avoid creating a false choice between keeping the services in tree and providing a public API. I wouldn't see any advantage to moving the services to separate repositories, though I can see where distributing them as separate PyPI packages might make for an easier workflow for installing their dependencies, as Python's "extras" don't necessarily map well to package managers.

> Bugwarrior has three "levels" of utility, I think: (1) synchronizing some list of issues against a task database; (2) coordinating and configuring such synchronizations, with a single config file and shared config options around tag handling, etc.; and (3) support for specific services. The service I want to talk to is proprietary, so adding it to Bugwarrior (3) is not an option for me. But I'd really like to get (2), so it can be configured alongside the other syncs I have set up. Just exporting the Synchronize class only gets me (1).

I like this analysis and it maps well to the code base:

  1. db.synchronize
  2. services.IssueService and services.Issue
  3. services/*

From a theoretical standpoint, (2) is a satisfactory argument that stabilizing these base classes' public API would be of significant value to private service implementations. From a practical standpoint, I still think a cost-benefit analysis is in order.
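To illustrate what a private service might get from stabilized base classes, here is a self-contained sketch. The `Issue` and `IssueService` classes below are simplified stand-ins written for this example, NOT bugwarrior's real `services.Issue` and `services.IssueService`; every method name, the `add_tags` key, and the sample record are hypothetical.

```python
from abc import ABC, abstractmethod


class Issue(ABC):
    """Stand-in base class; NOT bugwarrior's real services.Issue."""

    def __init__(self, record, default_tags=()):
        self.record = record
        self.default_tags = list(default_tags)

    @abstractmethod
    def to_task(self):
        """Translate the upstream record into a task dictionary."""


class IssueService(ABC):
    """Stand-in base class; NOT bugwarrior's real services.IssueService."""

    issue_class = Issue

    def __init__(self, config):
        # Shared configuration handling (level 2) lives in the base class.
        self.config = config
        self.default_tags = config.get("add_tags", [])

    @abstractmethod
    def fetch_records(self):
        """Yield raw records from the upstream tracker."""

    def issues(self):
        # Inherited behavior: wrap each record with the shared tag settings.
        for record in self.fetch_records():
            yield self.issue_class(record, self.default_tags)


class PrivateIssue(Issue):
    def to_task(self):
        return {
            "description": self.record["title"],
            "tags": self.default_tags + self.record.get("labels", []),
        }


class PrivateService(IssueService):
    issue_class = PrivateIssue

    def fetch_records(self):
        # A real implementation would call the proprietary API here.
        yield {"title": "Rotate credentials", "labels": ["ops"]}


service = PrivateService({"add_tags": ["work"]})
tasks = [issue.to_task() for issue in service.issues()]
```

In this sketch the private package only writes the two small subclasses; everything it inherits is what would have to stay stable.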

On the cost side, which methods would need to be stabilized? We could figure this out by analyzing which methods our current services call directly.
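One way to approximate that analysis is with the standard library's `ast` module: collect every `self.<name>(...)` call in a service module and subtract the methods the module defines itself. This is a rough sketch under stated assumptions; the helper names and the example source are hypothetical, and it would miss calls made through `super()`, plain attribute access, or aliases of `self`.

```python
import ast


def self_calls(source):
    """Return names of methods invoked as self.<name>(...) in the source."""
    called = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "self"
        ):
            called.add(node.func.attr)
    return called


def defined_methods(source):
    """Return names of all functions and methods defined in the source."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def inherited_calls(service_source):
    """Methods a service calls on self but does not define itself."""
    return self_calls(service_source) - defined_methods(service_source)


# Hypothetical service source, for illustration only.
EXAMPLE = '''
class ExampleService(IssueService):
    def issues(self):
        for record in self.fetch():
            yield self.get_issue_for_record(record)

    def fetch(self):
        return []
'''

inherited = sorted(inherited_calls(EXAMPLE))
```

For the example source, `inherited` contains only `get_issue_for_record`, since `fetch` is defined locally. Running this over each file under `services/` and taking the union would give a first-cut list of candidate methods.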

On the benefit side, how much inherited functionality would a private service really take advantage of? We could figure this out by "de-normalizing" one or two of our current services by moving inherited methods into the child classes.

Note: These proposed experiments are just one idea of a route forward, not the only way.

djmitche commented 1 year ago

Thanks for that clarification!

ryneeverett commented 1 year ago

FYI, I've been thinking more about this and doing some experimentation. I've started down a refactoring trajectory that I believe will end up revealing a "naturally stable" API that I would be comfortable stabilizing formally, while fixing some conceptual problems with the architecture at the same time. I expect I'll be opening seemingly unrelated PRs in the coming weeks which are justified by their own merits but ultimately motivated by this end goal.

djmitche commented 1 year ago

That sounds exciting! :)