Closed mrnugget closed 4 years ago
UpdateCampaignJob needs nulltimeColumn for StartedAt and StartedAt
Does this not break things as it currently is?
I just checked. That line is kinda invalid and I'll remove it from the ticket. We already do the correct thing when creating the CampaignJob:
And that works as intended: FinishedAt
and StartedAt
are set to NULL
when the CampaignJob
is first created. That allows us to query progress while they run.
We don't strictly need this in UpdateCampaignJob
.
Edit: Updated this comment to distinguish between CreateCampaignJob
and UpdateCampaignJob
Closing this, because the majority of the things in this ticket is now either fixed or outdated.
This is a braindump of all the things we ran into while working on #6085. The items on this list range from "nice to have" to "we should do it" to "we need to do this, sooner rather than later"
a8n.Store
repos.Store
anda8n.Store
?a8n.Store
are in one large file and have global statedbtest
database is useda8n.Store
: the column names are duplicated in every query string. We could probably substrings and include them in the other queries.UpdateCampaignJob
needsnulltimeColumn
forStartedAt
andStartedAt
Allow fetching all rows with
Limit: -1
instead ofLimit: $number-we-think-should-be-high-enough
Right now we sometimes fetch all rows of a given table by specifying something like
Limit: 10000
in the hopes of never having to fetch more than that. See the search results here: https://k8s.sgdev.org/search?q=repo:%5Egithub%5C.com/sourcegraph/sourcegraph%24+Limit%5C:%5Cs%2B%5Cd%7B3%2C%7D%2C+file:enterprise/internal/a8n/&patternType=regexpWha we already do for
ChangesetEvents
is to allow specifyingLimit: -1
:https://github.com/sourcegraph/sourcegraph/blob/8248533108ae3f47bd5fcb451b3912a4f0152e0a/enterprise/internal/a8n/store.go#L609-L618
Architecture and Design of a8n code
a8n.Service
andrun.Runner
— they are really similarrun
package?run.Runner
to take adone
channel as last parameter inRun
on which it sends a message when it's finished executing the jobs. That would allow us to get rid of theWaitgroup
and theWait()
method.Use a persistent queue
Extracted into https://github.com/sourcegraph/sourcegraph/issues/6723
Execute ChangesetJobs in parallel
Extracted into https://github.com/sourcegraph/sourcegraph/issues/6722
Error responses in
gitserver
when applying diff failsExtracted into separate issue: https://github.com/sourcegraph/sourcegraph/issues/6717
Changesets are never cleaned up
Right now a user can create multiple changesets with
createChangesets
and never attach them to a campaign and they'll persist.The same happens when a user deletes a campaign: the changesets will stay around and will be synced.
We can probably find some heuristic when it's safe to clean up a changeset, i.e. when a changeset is older than 5 hours and hasn't been attached to a campaign, we delete it.
Enterprise and OSS
a8n
packages, one in ininternal
and one inenterprise/pkg
internal/a8n
fromcmd/repo-updater/repos
becauserepo-updater
has theSources
that can load and createa8n.Changesets
Executing
CampaignType.searchQuery
Right now we use our own wrapper around
searchResolver
calledgraphqlbackend.RepoSearch
. Is it maybe better to usezoekt.Searcher.List
in thea8n.Runner
thangraphqlbackend.RepoSearch
? See the code here @keegancsmith suggested this in https://github.com/sourcegraph/sourcegraph/pull/6309#issuecomment-548691096 (See also https://github.com/sourcegraph/sourcegraph/issues/6627 for more context on this.)We probably want to go through the user-facing search code path for
matchTemplate
(in the case of acomby
campaign) and use structural search there (once it's ready). But do we also need to do the same forsearchScope
?Naming of and in
repos
packagerepos
packagerepos
package should be calledexternalservice
(orcodehost
), since it does a lot more than justrepos
. It could also probably be extracted fromrepo-updater
repos.Source
needs to be calledClient
, because it's more than a "source of repositories" by nowGraphQL layer and a8n
a8n
you need to:schema.graphql
./cmd/frontend/graphqlbackend/a8n.go
that will be implemented by the (enterprise-only)a8n
package./enterprise/pkg/a8n/resolvers/
./enterprise/pkg/a8n/resolvers
which again defines structs that reflect the GraphQL schema to which we can unmarshall the resolver response./enterprise/pkg/a8n/resolvers/resolver_test.go
are really verbose and heavyRepoID
somewhere)Inconsistencies in type definitions
uint32
, sometimesint32
, sometimesint64
api.RepoName
andapi.CommitID
— some use it, some don'tDeveloper UX when dealing with external services, repos and talking to code hosts
httpcli
middlewares inhttpcli
Fix multi-file diffs without extended header in
go-diff
go-diff
has a bug where it doesn't parse multi-file diffs correctly that have no headers between diffs.See this piece of code