Open achantavy opened 2 years ago
From @ryan-lane https://github.com/lyft/cartography/pull/826#issuecomment-1109419468
Could we separate out the analysis jobs that are intended to run along with the code, from the ones that are intended to connect nodes between modules? It may be good to put analysis and cleanup files for modules directly into their modules.
To be honest, I wasn't aware that the folder was run at the end of every run, and for anyone who changes the value of that option, that's not how it runs.
Description:
By default, the final cartography sync stage runs all analysis jobs located in the cartography/data/jobs/analysis folder. I noticed that there are cases where we call `run_analysis_job()` out of band, such as when syncing iaminstanceprofiles and analyzing lambda-to-ecr relationships, in S3 acls, everything here, and probably others.

The problem with these one-off calls of analysis jobs is that by default they will all be run a second time when we reach the final sync stage. This is wasted work and adds time to the sync, especially on a large graph.
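To make the double run concrete, here is a minimal, self-contained sketch. It does not import cartography: the `run_analysis_job()` below is a stand-in that only counts calls, the file names are illustrative, and the `skip` parameter is a hypothetical way to deduplicate, not something cartography provides today.

```python
from collections import Counter

# Counts how many times each analysis job file gets executed in one sync.
run_counts = Counter()


def run_analysis_job(filename):
    """Stand-in for cartography's run_analysis_job(); it only records the call."""
    run_counts[filename] += 1


# Illustrative contents of the analysis folder (names are assumptions).
ANALYSIS_FOLDER = [
    "iaminstanceprofile_analysis.json",
    "lambda_ecr_analysis.json",
    "s3acl_analysis.json",
]


def sync_module():
    # Out-of-band call made by a module during its own sync.
    run_analysis_job("lambda_ecr_analysis.json")


def final_analysis_stage(skip=frozenset()):
    # Default behavior: run every job in the folder, unless told to skip it.
    for filename in ANALYSIS_FOLDER:
        if filename not in skip:
            run_analysis_job(filename)


def simulate(dedupe):
    """Run one simulated sync and return the per-job execution counts."""
    run_counts.clear()
    sync_module()
    # Hypothetical fix: remember what already ran and skip it in the final stage.
    already_run = set(run_counts) if dedupe else set()
    final_analysis_stage(skip=already_run)
    return dict(run_counts)
```

Calling `simulate(dedupe=False)` shows the job that was called out of band being executed twice, while `simulate(dedupe=True)` runs every job exactly once. Tracking already-run jobs is only one possible approach; moving module-specific jobs out of the shared folder, as suggested in the quoted comment, would avoid the duplication without any bookkeeping.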
To summarize, `run_analysis_job()`.
Please complete the following information:
0.56.0