sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com

Configure Sourcegraph to not delete a repository when removed from a code host #61852

Open process0 opened 7 months ago

process0 commented 7 months ago

Question description

Is it possible to configure the server to not delete a repository when it is removed from a code host? It might be better to categorize such a repository as Orphaned instead. Automatic deletion during syncing is counterintuitive, especially when repositories are explicitly defined in the code host mirrors.

Is your question related to a problem? If so, please describe or link to another issue.

Possibly related https://github.com/sourcegraph/sourcegraph/issues/914

Additional context

There is this FAQ on purging data from deleted repositories: https://sourcegraph.com/docs/admin/how-to/remove-repo#manually-purge-deleted-repository-data-from-disk. Alternatively, there could be a button per code host configuration to delete all orphaned repositories.

eseliger commented 7 months ago

Hello!

What's the code host type you're using and could you share a version of your code host connection config without sensitive data?

process0 commented 4 months ago

@eseliger I am using a GitHub repository code host with a standard config. To prevent repositories from being deleted from disk, I set this in the site configuration:

"repoPurgeWorker": {
  "deletedTTLMinutes": 0,
  "intervalMinutes": 0
},

This issue is about marking Sourcegraph repositories as Orphaned when deleted from the Code Host instead of removing them from Sourcegraph. I would still like to be able to search on repositories removed from upstream.
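
To make the request concrete, here is a minimal sketch of the sync behavior I have in mind. It is not Sourcegraph's actual code: `RepoState`, `Repo`, `reconcile`, `searchable`, and the `keep_orphans` flag are all hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RepoState(Enum):
    ACTIVE = "active"
    ORPHANED = "orphaned"  # hypothetical state: gone upstream, still searchable
    DELETED = "deleted"


@dataclass
class Repo:
    name: str
    state: RepoState = RepoState.ACTIVE


def reconcile(local: dict[str, Repo], upstream: set[str], keep_orphans: bool = False) -> None:
    """Update local repository states after one (hypothetical) code host sync."""
    for name in upstream:
        # Repositories present upstream are added or restored as active.
        local.setdefault(name, Repo(name)).state = RepoState.ACTIVE
    for name, repo in local.items():
        if name in upstream or repo.state is RepoState.DELETED:
            continue
        # Missing upstream: orphan it if the flag is set, otherwise delete as today.
        repo.state = RepoState.ORPHANED if keep_orphans else RepoState.DELETED


def searchable(local: dict[str, Repo]) -> list[str]:
    """Names of repositories that would remain searchable."""
    return sorted(r.name for r in local.values() if r.state is not RepoState.DELETED)
```

With `keep_orphans=False` (the assumed default) a repository missing upstream is marked deleted, matching the behavior described in this issue; with `keep_orphans=True` it stays searchable until an admin removes it.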

eseliger commented 4 months ago

Ah! I misread initially what the issue was about - that makes sense as a feature to me!
We will talk about prioritization of this. It's a bit difficult because Sourcegraph isn't meant to be the system of record today, so we always operate under the assumption that "in a disaster case, the code host (system of record) still has a copy of the data".

process0 commented 4 months ago

I believe this could be implemented while still maintaining that Sourcegraph isn't a system of record and keeping the same behavior as today. I am open to contributing to this feature, time and approval permitting. Do I need to prepare and submit an RFC as described here?