Ericsson / codechecker

CodeChecker is an analyzer tooling, defect database and viewer extension for the Clang Static Analyzer and Clang Tidy
https://codechecker.readthedocs.io
Apache License 2.0

Give ability to detect if server is loading data. #2417

Open scphantm opened 5 years ago

scphantm commented 5 years ago

I have a 2 pod cluster going on. The way your save method appears to work, the client uploads the zip, then the client gets a response and moves on. A process kicks off on the server that unzips the file and imports the information into the report.

My problem is that we want to use data from CodeChecker as a go/no-go gate in our overall workflow. We want to say "if the number of critical issues increases from the previous run, blow the build up". But your numbers aren't going to be accurate until the server-side import is complete.

My system does multiple builds in parallel, then uploads the results to CodeChecker in parallel. So I could have builds build139-rhel7, build139-suse15, build139-bsd10, build139-bsd12, and build139-macos each running on a different machine and uploading at the same time (hopefully). On those build nodes, I need a way to ask the server whether build139-rhel7 has finished importing its data. That way I can compare build138-rhel7 with build139-rhel7 and determine whether the number of critical and medium bugs went up or down. If they went up, halt the build and file a bug in Jira.
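The go/no-go rule described above could be sketched roughly as follows. This is only an illustration: the severity names, the `regressed_severities` helper, and the sample counts are assumptions made for the example, not anything CodeChecker provides.

```python
from typing import Mapping

# Severities that gate the build; these names are illustrative assumptions.
GATED_SEVERITIES = ("CRITICAL", "MEDIUM")


def regressed_severities(previous: Mapping[str, int],
                         current: Mapping[str, int],
                         gated=GATED_SEVERITIES) -> dict:
    """Return {severity: (old, new)} for each gated severity whose count rose."""
    regressions = {}
    for severity in gated:
        old = previous.get(severity, 0)
        new = current.get(severity, 0)
        if new > old:
            regressions[severity] = (old, new)
    return regressions


# Hypothetical counts for build138-rhel7 and build139-rhel7.
build138 = {"CRITICAL": 3, "MEDIUM": 40, "LOW": 120}
build139 = {"CRITICAL": 4, "MEDIUM": 38, "LOW": 150}

failures = regressed_severities(build138, build139)
if failures:
    print("halt the build:", failures)
```

The comparison is only meaningful once the counts for the newer run are final, which is the point of this request.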

I need a method to tell whether the build has finished importing. I would recommend a flag on your getRunData transaction, something like this:

RunData(detectionStatusCount={0: 19053, 5: 5},
        codeCheckerVersion='None (None)',
        runDate='2019-10-15 13:31:15.512850',
        name='myapp',
        versionTag=None,
        runCmd=None,
        runId=1,
        duration=14182,
        resultCount=19053,
        analyzerStatistics={'clangsa': AnalyzerStatistics(successful=3987, failed=392, failedFilePaths=None, version=None)},
        loadStatus={running|failed|finished})

I can then poll getRunData periodically and check loadStatus for the build that node is concerned with. When the value changes to finished, I can move on with my workflow.

gyorb commented 4 years ago

We have a CLI command, CodeChecker cmd products list -o json, whose output contains a runStoreInProgress value. This is a list of the runs for which report storage is ongoing. It is a product-level run list. Is that enough information? Maybe I missed it, but do you store the results under the same run name, or do you use a separate run name for each build (build139-rhel7, build139-macos, ...)? If you use separate run names, all of them should be in the list while the storage is ongoing.
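As a rough sketch of how a script might read that value: the snippet below walks whatever JSON the command prints and collects every name found under a runStoreInProgress key. The exact layout of the output is not shown in this thread, so the sample document and the `runs_with_store_in_progress` helper are assumptions; the recursive search is used precisely to avoid depending on the layout.

```python
import json


def runs_with_store_in_progress(products_json):
    """Collect every run name found under any 'runStoreInProgress' key."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "runStoreInProgress" and isinstance(value, list):
                    found.update(str(name) for name in value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(products_json)
    return found


# Hypothetical sample; the real output layout may differ.
sample = json.loads("""
[{"Default": {"runStoreInProgress": ["build139-rhel7", "build139-macos"]}}]
""")
in_progress = runs_with_store_in_progress(sample)
print("build139-rhel7" in in_progress)
```

In a real pipeline the `sample` value would be replaced by the parsed standard output of the CLI command mentioned above.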

scphantm commented 4 years ago

OK, I see it in the product structure. I should be able to use the getProduct APIs to pull it into my script. If that works, then yes, it's enough, because then I can have a daemon polling that API every, let's say, 15 seconds with a specific run name. As soon as the run disappears from that list, I can signal everything else to move on. Gotta love event-driven build pipelines.
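A polling loop along those lines might look like the sketch below. The `wait_until_stored` function and its `fetch_in_progress` callback are hypothetical names; the callback stands in for whatever call returns the current runStoreInProgress list, and the one-hour timeout is an arbitrary assumption.

```python
import time


def wait_until_stored(run_name, fetch_in_progress, interval=15.0,
                      timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until run_name is absent from the in-progress list.

    Returns True once the run is no longer listed, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if run_name not in fetch_in_progress():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake fetcher that reports the run as in progress twice.
responses = iter([["build139-rhel7"], ["build139-rhel7"], []])
done = wait_until_stored("build139-rhel7", lambda: next(responses),
                         sleep=lambda seconds: None)
print(done)
```

One limitation of this approach: a run that has not started storing yet is also absent from the list, so the loop cannot tell "not begun" apart from "finished" unless the poller only starts after the upload has been kicked off.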

whisperity commented 5 months ago

I believe the asynchronous storing system, once complete, will allow this use case, as every store operation will be assigned a unique token with which you can query the job status (see #3672). That way, you will be able to tell whether the storing hasn't begun yet (if you're doing the polling from another job), is ongoing, or has finished.

> The way your save method appears to work, the client uploads the zip, then the client gets a response and moves on. A process kicks off on the server that unzips the file and imports the information into the report.

I'm not sure this deduction was ever correct, but it is definitely incorrect as of now. The client waits (and blocks the terminal/script/workflow/etc.) for the response and the response arrives when the server has finished (or deterministically failed) storing the result. So simply waiting out the return value of CodeChecker store should be sufficient for this use case, as long as the "store" and the "diff" happen in the same job (same machine, same script, etc.). (#3672 was created due to an issue we were observing that this result-waiting by the client can hang on the networking stack level, and never return.)
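For the same-job case, a minimal sketch of "wait for store, then diff" could look like this. The `store_then_diff` helper is hypothetical, and the CodeChecker command lines in the comments are assumptions to be checked against the installed version; the demo uses stand-in commands so that it runs without CodeChecker.

```python
import subprocess
import sys


def store_then_diff(store_cmd, diff_cmd):
    """Run store_cmd to completion, then diff_cmd.

    Returns (diff exit code, diff stdout). Raises
    subprocess.CalledProcessError if the store command exits non-zero.
    """
    # subprocess.run blocks until the process exits; check=True raises on failure.
    subprocess.run(store_cmd, check=True)
    diff = subprocess.run(diff_cmd, capture_output=True, text=True)
    return diff.returncode, diff.stdout


# In a real job the commands might resemble (assumed, verify locally):
#   ["CodeChecker", "store", "./reports", "--name", "build139-rhel7",
#    "--url", PRODUCT_URL]
#   ["CodeChecker", "cmd", "diff", "--basename", "build138-rhel7",
#    "--newname", "build139-rhel7", "--new", "--url", PRODUCT_URL]
# Stand-in commands so the sketch runs anywhere:
store = [sys.executable, "-c", "pass"]
diff = [sys.executable, "-c", "print('no new reports')"]
code, output = store_then_diff(store, diff)
print(code, output.strip())
```

The diff exit code is returned rather than checked, since how the diff command signals "new reports found" is not described in this thread.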

If you want to poll results from another machine then a different approach is needed.