mitre / magma

MITRE Caldera's user interface plugin powered by VueJS
Apache License 2.0
3 stars 7 forks source link

Fix for high network utilization from operation api endpoint #51

Closed timbrigham-oc closed 4 months ago

timbrigham-oc commented 4 months ago

I recently opened a ticket for caldera, 2980 that is more closely related here.

For reference, this is for a 5.0.0 Caldera instance running with the latest build for Magma.

After disabling the minification for vite I was able to track down the cause. The calls to the API endpoint api/v2/operations were definitely coming from operationStore.js (I added a console.error each time it was called to be sure it was the same path.

vite.config.js

....
  build: {
    rollupOptions: {
      external: [
        /gui/
      ]
    },
    // trbdk3 - disable minify on the output JS code since we are debugging an issue
    minify: false
  },
....

operationStore.js

  actions: {
    async getOperations($api) {
      try {
        console.error("Calling get v2 operations" );
        const response = await $api.get("/api/v2/operations");
        // TODO: Sort operations
        for (const operation of response.data) {
          this.operations[operation.id] = operation;

From here I located the calls to getOperations in OperationsView.vue. The function name selectOperation lines up with non minified code.

I'm not for sure what that code is supposed to be doing, but on the current release there is something wrong the this logic.

OperationsView.vue

  updateInterval = setInterval(async () => {
    if (operationStore.selectedOperationID !== "") {
      await operationStore.getOperations($api);
    } else {
      clearInterval(updateInterval);
    }
  }, "3000");

My best guess is for some reason the time is getting set to zero from the setInterval call, and the updates are running constantly.

When I made the updateInterval 30 seconds and entered it as an int vs a string it worked as I think is intended, without DOSing my host.

I'm not sure how many people have hundreds of operations with hundreds of line items each, but that api/v2/operations gets large in a hurry for me.. Having this be a configurable option - and having a higher default in my opinion - would be worth consideration.

OperationsView.vue

  updateInterval = setInterval(async () => {
    if (operationStore.selectedOperationID !== "") {
      await operationStore.getOperations($api);
    } else {
      clearInterval(updateInterval);
    }
    }, 30000);
elegantmoose commented 4 months ago

Im wondering if we are passing the wrong type for the time variable to the setInterval().

Did you notice any improvement when just passing the int 3000 instead of str "3000"?

timbrigham-oc commented 4 months ago

I did some checking on that - setInterval should accept a string and contort it into an int. I didn't see any changes with our without the quotes unfortunately.

While I was trolling through the source code I did see a note about doing some sorting on the operations returned right after the call to the /api/v2/operations endpoint that was killing my bandwidth. Here maybe?

It makes me think that what is really needed for a proper long term solution (instead of slowing down the console refresh rate would be a new endpoint, maybe /api/v2/active_operations that would only respond with the details for for actively running operations.

If I'm understanding the flow correctly, this is what is occurring.

Magma OperationsView.vue calls getOperations stored in operationStore.js when the https://xxx/operations page is loaded.

operationStore.js calls $api.get("/api/v2/operations")

Caldera /api/v2/operations calls get_operations here

get_operations calls get_all_objects here

get_all_objects calls find_and_dump_objects here

Comments

This makes me think that the fix for this is still in Magma, since that's where the call to this endpoint is occurring.

The find_and_dump_objects function looks like (again, I'm a PowerShell guy moreso then Python) it should have support for search criteria, and get_all_objects should be passing it, if provided. Since we are getting back everything from that endpoint, I don't think any filters are being passed.

The include / exclude statements make it looks like the HTTP request should have include / exclude filters defined. I have no idea what this should look like or I'd try running it against my /api/v2/operations endpoint to see.

For what it's worth, calling the /api/v2/operations?exclude=test and /api/v2/operations?include=test have the results I'd expect. Nothing for the include (I'm sure my matching criteria are bad) and everything for the exclude.

If you could provide what that filter syntax should look like @elegantmoose I'd love to try it.

Once that filtering narrowed down to live actions, I'm thinking that adding the filtering into operationStore.js call to the endpoint would be the long term fix?

timbrigham-oc commented 4 months ago

I can't believe I didn't notice the swagger docs here.

Looks like something like this in the Magma would do it, maybe? I'm sure my syntax is off, not much of a JavaScript guy either.

operationStore.js

  actions: {
    async getOperations($api) {
      try {
        // get only the status and ID for later lookups
        responseMetaData = await $api.get("/api/v2/operations?include=state&include=id?sort=finish");
        // iterate each operation from the metadata
        for (const operation of responseMetaData.data) {
        // if it is an ongoing operation get updates
         if( operation.state -ne "finished" )
         {
               // get a response for this specific operation ID 
               const response =  await $api.get("/api/v2/operations/$(operation.id)");
               const insertMe = response.data
               this.operations[insertMe.id] = insertMe;
timbrigham-oc commented 4 months ago

I dug into this a bit further.. That getOperations is used a ton of other places so that changing that behavior doesn't seem a viable method.

I'm playing with an alternative function that does the same thing getOperationsMetadata that could be called instead of operationStore.getOperations in that specific point in the loop, something where the data being returned could be filtered.

Maybe something like this? const response = await $api.get("/api/v2/operations?sort=created&include=id&include=name&include=start");