jenkinsci / opentelemetry-plugin

Monitor and observe Jenkins with OpenTelemetry.
https://plugins.jenkins.io/opentelemetry/
Apache License 2.0
100 stars 53 forks source link

Capability to manually create spans #956

Closed xBis7 closed 3 weeks ago

xBis7 commented 1 month ago

opentelemetry-plugin issue: https://github.com/jenkinsci/opentelemetry-plugin/issues/90

This PR is based on the work done on PR https://github.com/jenkinsci/opentelemetry-plugin/pull/84.

The patch is adding a new step withNewSpan which allows users to create their own spans.

The option accepts 3 parameters

From the above 3 parameters, only the label is required and everything else is optional.

Example usage

withNewSpan(label: 'custom-build-span', attributes: ([
  spanAttribute(key: 'modules', value: '2'),
  spanAttribute(key: 'command', value: 'mvn clean install')
]), setAttributesOnlyOnParent: true) {
  sh './build-module1.sh'
  sh './build-module2.sh'
}

or

withNewSpan(label: 'custom-build-span', attributes: ([
  spanAttribute(key: 'modules', value: '2'),
  spanAttribute(key: 'command', value: 'mvn clean install')
])) {
  sh './build-module1.sh'
  sh './build-module2.sh'
}

or

withNewSpan(label: 'custom-build-span') {
  sh './build-module1.sh'
  sh './build-module2.sh'
}

Testing done

Screenshot from manual testing

withNewSpan_jaeger_testing

Submitter checklist

xBis7 commented 1 month ago

@kuisathaverat @cyrille-leclerc Can you take a look at this PR?

It's related to this issue https://github.com/jenkinsci/opentelemetry-plugin/issues/90

xBis7 commented 1 month ago

@kuisathaverat Did you get a chance to take a look?

kuisathaverat commented 1 month ago

Sorry for not getting back to you sooner, but I did not find time to perform the manual test needed to verify that there are no regressions. Also, I want to make a huge pipeline test to verify there are no leaks or performance issues.

xBis7 commented 1 month ago

@kuisathaverat Is there something I can do to help? Run some test? I've run manual tests but my use-cases might not match yours.

kuisathaverat commented 1 month ago

I need more time. I have compared two "similar" pipelines to see if the memory usage grows, and it does.

A pipeline with 500 steps uses about 450MB of RAM and does not increase memory usage after it is finished. I executed it 10 times.

pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
        bunchOfSteps(0)
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 100) {
        sh "echo level ${level}"
        bunchOfSteps(level + 1)
    }
}

A pipeline with 500 steps and the new step to create spans uses about 850MB of RAM and increases the memory usage after it finishes about 350MB. I executed it one time.

pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
        bunchOfSteps(0)
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 500) {
        withNewSpan(label: 'custom-build-span-' + level, attributes: ([
            spanAttribute(key: 'level', value: level),
            ]), setAttributesOnlyOnParent: true) {
                sh "echo level ${level}"
                bunchOfSteps(level + 1)
            }
    }
}
xBis7 commented 1 month ago

It would be interesting to compare the RAM usage between

pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
        bunchOfSteps(0)
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 500) {
        withSpanAttributes([
            spanAttribute(key: 'level', value: level),
            ]) {
                sh "echo level ${level}"
                bunchOfSteps(level + 1)
            }
    }
}
pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
        bunchOfSteps(0)
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 500) {
        withNewSpan(label: 'custom-build-span-' + level, attributes: ([
            spanAttribute(key: 'level', value: level),
            ]), setAttributesOnlyOnParent: true) {
                sh "echo level ${level}"
                bunchOfSteps(level + 1)
            }
    }
}

withSpanAttributes might also increase the RAM usage afterwards.

kuisathaverat commented 1 month ago

this one directly crash after 250 steps, but I not see the memory change after several executions

def bunchOfSteps(level) {
    if (level < 500) {
        withSpanAttributes([
            spanAttribute(key: 'level', value: level),
            ]) {
                sh "echo level ${level}"
                bunchOfSteps(level + 1)
            }
    }
}
kuisathaverat commented 1 month ago

I am going to test this one too

def bunchOfSteps(level) {
    if (level < 500) {
        withNewSpan(label: 'custom-build-span-' + level, setAttributesOnlyOnParent: true) {
                sh "echo level ${level}"
                bunchOfSteps(level + 1)
            }
    }
}
kuisathaverat commented 1 month ago

I am going to change the approach recursivity 500 it is a lite too much for a pipeline

kuisathaverat commented 1 month ago
pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
        bunchOfSteps(0)
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 10) {
        recursiveSteps(0)
        bunchOfSteps(level + 1)
    }
}

def recursiveSteps(level){
    if (level < 50) {
        withNewSpan(label: 'custom-build-span-' + level, attributes: ([
            spanAttribute(key: 'level', value: level),
            ]), setAttributesOnlyOnParent: true) {
                sh "echo level ${level}"
                recursiveSteps(level + 1)
            }
    }
}
kuisathaverat commented 1 month ago

I am also testing

pipeline {
  agent any
  stages {
    stage('prepare') {
      steps {
      withNewSpan(label: 'custom-build-span', attributes: ([
        spanAttribute(key: 'level', value: 'foo'),
        ]), setAttributesOnlyOnParent: false) {
            bunchOfSteps(0)
        }
      }
    }
  }
}

def bunchOfSteps(level) {
    if (level < 500) {
        sh "echo level ${level}"
        bunchOfSteps(level + 1)
    }
}
kuisathaverat commented 1 month ago

I need to grab some head dumps and verify there is nothing weird.

kuisathaverat commented 3 weeks ago

The memory issue was a snowflake. I cannot replicate it anymore, but I do not see any issue, so we can merge.

xBis7 commented 3 weeks ago

@kuisathaverat Thanks for the review and the help!

xBis7 commented 3 weeks ago

@kuisathaverat Thanks for merging it. Regarding using the change, should we wait for a new release, or is there some other way?

kuisathaverat commented 3 weeks ago

When we merge a change with labels bug or enhancement a new release is trigger, IIRC it takes 6 hours or so to be in the Jenkins update center https://github.com/jenkinsci/opentelemetry-plugin/releases/tag/3.1419.v3b_27ca_911066

xBis7 commented 3 weeks ago

@kuisathaverat Thanks!