gocd / gocd

GoCD - Continuous Delivery server main repository
https://www.gocd.org
Apache License 2.0

ability to delete pipeline history & artifacts on deletion from config #879

Closed: srinivasupadhya closed this issue 4 years ago

srinivasupadhya commented 9 years ago

Currently, pipeline history is not deleted, so that it can be restored: if you add the pipeline back to the config, you get all historical runs + artifacts.

Downside:

Positives:

Approach:

tedops commented 8 years ago

Alternatively, a Delete/Purge implementation would be acceptable, where pipelines can be marked for deletion using Delete, with an option (under Admin + an API endpoint) to Purge deleted pipelines.

euven commented 8 years ago

We are interested in getting this functionality, as we have a lot of pipelines and pretty high pipeline turnover.

tedops commented 8 years ago

Perhaps one way to accomplish this would be to prompt the user to either (i) restore the old pipeline + artifacts, or (ii) create a new one, and archive the previously-existing pipeline.

If the user chooses to archive the previously-existing pipeline, it can be moved to a PIPELINEARCHIVE table (or similar), with the app behavior based on the following configurable choices:

jasonolmstead33 commented 8 years ago

Any movement on this? We are also in a situation where we would have a lot of pipelines, and we would love to be able to delete them or move them to a new group via the API.

euven commented 8 years ago

Hi guys,

Any progress here? Due to high pipeline turnover, we are running out of disk space :(

Thanks!

zabil commented 8 years ago

@eugeneventer we are not working on this at the moment.

arvindsv commented 8 years ago

@eugeneventer: If you're running out of space, it will probably not be because of pipeline history (which will be a small part of the DB). It's probably because of artifacts. You can either set them to be automatically deleted or delete them yourself. The GoCD janitor might be useful too; it's a tool written by people from the community to clean up artifacts.

ghost commented 8 years ago

Hey. I also miss this functionality, mainly because it is not possible to perform a clean-up and continue with a clean slate. It's hard to remember which names were already used, and it was surprising when old history appeared for a pipeline after I reused an old name. Anyway, GoCD is great, thank you for all the hard work!

mfilteau commented 7 years ago

+1 Hi, I definitely want this feature!

ghost commented 7 years ago

+1, I would love to see this feature, just to start with a clean slate. I have been testing how GoCD works and, with that in mind, I would like to create a clean and decent setup. If there is any way to remove everything manually without completely uninstalling GoCD and installing from scratch, I would love to hear it.

arvindsv commented 7 years ago

If you bring down the server, move/rename the file /var/lib/go-server/db/h2db/cruise.h2.db (or C:\Program Files (x86)\Go Server\db\h2db\cruise.h2.db on Windows), and restart the server, it should start with no history at all, but your pipelines will still be around, since the config is still around.

dudadornelles commented 6 years ago

Would there be a hacky SQL statement that we could use to clean up the run history of a pipeline right in the DB? We're suffering with performance...

arvindsv commented 6 years ago

@dudadornelles - No. The performance issue might not be just that; in fact, it's very rare for it to be. That part is heavily cached on startup.

arvindsv commented 6 years ago

The "No" is because the pipeline history is used to generate the VSM, etc. and we're not sure what happens to the rest of the app, if history is deleted in parts. Meaning, can't say what it will affect. Sorry. :(

dudadornelles commented 6 years ago

I have a couple of pipelines with 40k+ runs, and any time we open them things get pretty slow. The other pipelines, with much less history, run just fine though.

arvindsv commented 6 years ago

I tried a pipeline on an old GoCD server of ours with 150,000 runs and it was fine. I'm not sure it's directly correlated, but it seems to be the case for you. I'll try and find some time to look into your issue soon.

FarshadHadei commented 5 years ago

Hi, any updates regarding this issue?

gosiapolitowska commented 5 years ago

Hi, any updates?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had activity in the last 90 days. If you can still reproduce this error on the master branch using local development environment or on the latest GoCD Release, please reply with all of the information you have about it in order to keep the issue open. Thank you for all your contributions.

lmachacek commented 1 month ago

It is a pity that this hasn't been implemented :frowning_face:

For example, we run smoke tests every 5 minutes, and over a few years GoCD tables like pipelinematerialrevisions and buildstatetransitions swelled up to tens of millions of rows. As a result, GoCD became slower, required more and more memory, and its start time grew to around 10 minutes!

For those facing the same troubles, here is what we did:

  1. Stop GoCD

  2. Run these queries (it will remove history for all pipelines matching the query!):

     DELETE FROM pipelinematerialrevisions WHERE pipelineid IN (SELECT id FROM pipelines WHERE name LIKE '%-smoke-tests-%' OR name LIKE '%-fitness-functions');

     DELETE FROM modifications WHERE pipelineid IN (SELECT id FROM pipelines WHERE name LIKE '%-smoke-tests-%' OR name LIKE '%-fitness-functions');

     DELETE FROM buildstatetransitions WHERE stageid IN (SELECT s.id FROM pipelines p, stages s WHERE (p.name LIKE '%-smoke-tests-%' OR p.name LIKE '%-fitness-functions') AND p.id = s.pipelineid);

     DELETE FROM pipelines WHERE name LIKE '%-smoke-tests-%' OR name LIKE '%-fitness-functions';

  3. Start GoCD
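As a rough sanity check (not part of the original steps, and assuming the same PostgreSQL schema the DELETEs above run against), the per-table row counts can be compared before and after the cleanup with something like:

    -- Row counts for the affected tables; run before and after the cleanup.
    SELECT 'buildstatetransitions' AS table_name, count(*) AS row_count FROM buildstatetransitions
    UNION ALL
    SELECT 'pipelinematerialrevisions', count(*) FROM pipelinematerialrevisions
    UNION ALL
    SELECT 'pipelines', count(*) FROM pipelines;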

In our case the result reduction was:

| Table                     | Before     | After     |
|---------------------------|------------|-----------|
| buildstatetransitions     | 32 039 070 | 4 126 849 |
| pipelinematerialrevisions | 11 591 439 | 878 363   |
| pipelines                 | 4 960 490  | 313 732   |

Now GoCD starts 5x faster and doesn't need 20GiB of RAM :wink:

chadlwilson commented 1 month ago

I don’t think this ticket would solve your problem directly, just work around it by allowing you to delete the pipeline and re-add it. Do you automatically create temporary pipelines and then throw them away, or something? Having ~5M pipeline run instances in your history seems pretty extreme.

Worth noting for others that deleting the history like this may have implications for other pipeline dependencies, the ability to fan-in correctly, etc. So be careful.

What database are you using?

lmachacek commented 1 month ago

We use PostgreSQL v12 RDS in AWS.

No, we don't have any temporary pipelines. We have, for example, the pipeline DEV-smoke-tests-shared-infrastructure, which is executed every 5 minutes. That pipeline has been executed 499 303 times so far, which means it has 499 303 records in the pipelines table. This is why we had such a huge number of rows in those DB tables.
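For reference, which logical pipelines account for the most rows in the pipelines table can be checked with a query along these lines (a sketch, assuming the same schema the cleanup above runs against):

    -- Logical pipelines with the most run instances (one row per run in the pipelines table).
    SELECT name, count(*) AS run_count
    FROM pipelines
    GROUP BY name
    ORDER BY run_count DESC
    LIMIT 20;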

chadlwilson commented 1 month ago

Yeah, that's certainly an interesting and pretty extreme case, but yes, this feature would be a "workaround" for the root issue you seem to be reporting, which is severely degraded performance (especially of the overall dashboard and startup time, rather than something affecting only that pipeline) with pipelines that have a huge number of runs.

I can't think of anything specific to a pipeline's number of run instances, revisions or transitions that should affect server start-up time, and it shouldn't affect the dashboard either. I'm certainly somewhat surprised to hear it would have that effect on dashboard performance and memory usage, as that sounds like an excessive cache inflation or fetching issue.

How many such logical pipelines did you have that allowed you to remove ~4.6M pipeline run instances from the PIPELINES table with the above commands?

To document the root cause and give it a better chance of being looked at, would it be possible to raise a separate issue that includes the summary of your setup here, and an approximation of what the VSM across these smoke-tests pipelines and their dependencies would have looked like? If they are standalone (which would explain why you could run the above commands safely without affecting upstream/downstream dependencies), that would also be useful to know.

lmachacek commented 1 month ago

OK, I'll create a separate issue.

BTW: we have 36 logical pipelines, which allowed us to remove ~4.6M pipeline run instances. The slow startup is caused by loading the pipeline timeline and loading active pipelines into the cache.

Before cleanup (and this is quite a fast example - sometimes it took almost 10 minutes):

2024-04-17 20:05:44,586 INFO  [WrapperJarAppMain] PipelineRepository:78 - Start updating pipeline timeline
2024-04-17 20:07:19,498 WARN  [WrapperJarAppMain] PipelineRepository:101 - updating in memory pipeline-timeline took: 94910 ms
2024-04-17 20:10:56,199 INFO  [WrapperJarAppMain] PipelineRepository:84 - Pipeline timeline updated
2024-04-17 20:10:56,202 INFO  [WrapperJarAppMain] PipelineSqlMapDao:133 - Loading active pipelines into memory.
2024-04-17 20:10:56,211 INFO  [WrapperJarAppMain] PipelineSqlMapDao:381 - Retriving Active Pipelines from Database...
2024-04-17 20:11:10,810 INFO  [Thread-96] PipelineSqlMapDao:417 - Loading pipeline history to cache...Started
2024-04-17 20:11:10,810 INFO  [Thread-95] PipelineSqlMapDao:411 - Loading Active Pipelines to cache...Started
2024-04-17 20:11:10,810 INFO  [WrapperJarAppMain] MaterialRepository:239 - Loading PMRs,Remaining 3 Pipelines (Total: 3)...
2024-04-17 20:11:10,814 INFO  [Thread-95] PipelineSqlMapDao:414 - Loading Active Pipelines to cache...Done
2024-04-17 20:11:10,829 INFO  [Thread-96] PipelineSqlMapDao:421 - Loading pipeline history to cache...Done
2024-04-17 20:11:10,841 INFO  [WrapperJarAppMain] MaterialRepository:315 - Loading modifications, Remaining 3 PMRs(Total: 3)...
2024-04-17 20:11:10,850 INFO  [WrapperJarAppMain] PipelineSqlMapDao:135 - Done loading active pipelines into memory.

After cleanup:

2024-04-20 19:41:15,340 INFO  [WrapperJarAppMain] PipelineRepository:78 - Start updating pipeline timeline
2024-04-20 19:41:45,327 WARN  [WrapperJarAppMain] PipelineRepository:101 - updating in memory pipeline-timeline took: 29985 ms
2024-04-20 19:41:57,392 INFO  [WrapperJarAppMain] PipelineRepository:84 - Pipeline timeline updated
2024-04-20 19:41:57,402 INFO  [WrapperJarAppMain] PipelineSqlMapDao:133 - Loading active pipelines into memory.
2024-04-20 19:41:57,411 INFO  [WrapperJarAppMain] PipelineSqlMapDao:381 - Retriving Active Pipelines from Database...
2024-04-20 19:42:19,219 INFO  [Thread-95] PipelineSqlMapDao:411 - Loading Active Pipelines to cache...Started
2024-04-20 19:42:19,219 INFO  [Thread-96] PipelineSqlMapDao:417 - Loading pipeline history to cache...Started
2024-04-20 19:42:19,220 INFO  [WrapperJarAppMain] MaterialRepository:239 - Loading PMRs,Remaining 3 Pipelines (Total: 3)...
2024-04-20 19:42:19,223 INFO  [Thread-95] PipelineSqlMapDao:414 - Loading Active Pipelines to cache...Done
2024-04-20 19:42:19,241 INFO  [Thread-96] PipelineSqlMapDao:421 - Loading pipeline history to cache...Done
2024-04-20 19:42:19,304 INFO  [WrapperJarAppMain] MaterialRepository:315 - Loading modifications, Remaining 3 PMRs(Total: 3)...
2024-04-20 19:42:19,317 INFO  [WrapperJarAppMain] PipelineSqlMapDao:135 - Done loading active pipelines into memory.

chadlwilson commented 1 month ago

Ahh right. Based on those log timestamps...

2024-04-17 20:10:56,211 INFO  [WrapperJarAppMain] PipelineSqlMapDao:381 - Retriving Active Pipelines from Database...
2024-04-17 20:11:10,810 INFO  [Thread-96] PipelineSqlMapDao:417 - Loading pipeline history to cache...Started
2024-04-17 20:11:10,810 INFO  [Thread-95] PipelineSqlMapDao:411 - Loading Active Pipelines to cache...Started

...the likely main slowness here is this query:

https://github.com/gocd/gocd/blob/67fbd63486f5ac819c74e63996ee8ef7ba3763dd/server/src/main/resources/com/thoughtworks/go/server/dao/maps/Pipeline.xml#L435-L447

which builds on

https://github.com/gocd/gocd/blob/67fbd63486f5ac819c74e63996ee8ef7ba3763dd/server/src/main/resources/com/thoughtworks/go/server/dao/maps/Pipeline.xml#L282-L294

To yield

SELECT
        pipelines.id as pipelineId, pipelines.name as pipelineName, buildCauseType, label, buildCauseMessage,
        pipelines.counter as pipelineCounter, pipelines.label as pipelineLabel, pipelines.naturalOrder as naturalOrder,
        pipelines.comment as pipelineComment, stages.name as stageName,stages.counter as stageCounter, stages.id as stageId, stages.approvedBy as approvedBy, stages.cancelledBy as cancelledBy,
        stages.approvalType as approvalType, stages.result as stageResult, stages.latestRun, stages.rerunOfCounter,
        builds.id as buildId, builds.name as buildName, builds.state as buildState, builds.result as buildResult,
        builds.scheduledDate as scheduledDate, stages.orderId as orderId
        FROM pipelines
        INNER JOIN stages ON stages.pipelineId = pipelines.id AND stages.latestRun = true
        INNER JOIN builds ON builds.stageId = stages.id AND builds.ignored != true
                INNER JOIN (
            SELECT stages.pipelineId AS id
            FROM stages
            WHERE stages.state = 'Building' AND latestRun = true
        UNION
            SELECT max(pipelines.id) AS id
                FROM pipelines
                GROUP BY pipelines.name
        ) pipelines_to_show ON pipelines.id = pipelines_to_show.id
        ORDER BY pipelines.id DESC, stages.orderId ASC

It'd be important to see whether the time gap is in exactly the same place for the VERY slow examples too, though. There's not much reason its timing should vary a lot if it's just the query, unless there are problems on/with your database.

Probably too late now since you have cleaned things up, but I wonder whether semantics are different with the in-memory database vs. PostgreSQL, and whether looking at the query plan to see what it's doing might help.
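For example, a hypothetical way of doing that on PostgreSQL, using just the pipelines_to_show subquery from the generated SQL above (the full statement can be wrapped the same way):

    -- Inspect the plan of the inner subquery that selects which pipeline instances to show.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT stages.pipelineId AS id
    FROM stages
    WHERE stages.state = 'Building' AND latestRun = true
    UNION
    SELECT max(pipelines.id) AS id
    FROM pipelines
    GROUP BY pipelines.name;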

Will move this to the relevant issue at some point; just recording here for now.