Support agents at image generation time

thegreystone commented 5 years ago

It would be great if agents could be supported at image generation time, so that the instrumented code would be used in the final image.

cstancu commented 5 years ago

How do you envision such a feature? It seems to me that bytecode instrumentation via agents is an orthogonal feature. You can run an agent on your code, instrument it, and then feed it to native-image.

tylerbenson commented 5 years ago

I'm not super familiar with Graal's build options, but what I would be hoping for is a way to reuse the existing javaagent tooling at build time instead of run time.

peter-hofer commented 5 years ago

I think supporting existing Java agents (or even JVMTI agents) in the image build would not be straightforward because they are quite often more dynamic than just instrumenting bytecode when it is first loaded, and agents can also affect the code of the image build itself and not just application and library classes. I'm also not sure whether they currently work well with JVMCI. This could be feasible to implement in a limited scope without support for redefinition or retransformation.

peter-hofer commented 5 years ago

(closed unintentionally)

bsideup commented 5 years ago

It would be nice if native-image supported one-time class file transformations.

Preferably with the existing Java Agents API (simulating premain), but if a separate API is required, that's not a blocker for our use cases.

thegreystone commented 5 years ago

Sorry for not following up. Agreed, limited functionality will go a long way - feeding the classes that will be part of the image through the agent(s) and giving the agents a chance to transform them would cover my use case(s) too.

dougxc commented 5 years ago

> I'm also not sure whether they currently work well with JVMCI

JVMCI and Graal work fine with JVMTI agents (it's a bug if they don't). We also load method substitution bytecode from disk to avoid instrumentation effects.

raphw commented 5 years ago

This would be very relevant for me, too. I have a monitoring application for which I would like to support this. If Graal could mimic the instrumentation API, this would of course be a plus but a different API that only supports registering a ClassFileTransformer for all classes that are added to the image would solve most use cases, I think.

If I may suggest an implementation: I do not think that such an agent should affect the classes of the JVM process that is building the image as this might interfere with the build process. Instead, there should be an API that allows for:

a) Registering a ClassFileTransformer for transforming all classes before they are added to the image. The arguments for Module, ClassLoader, Class and ProtectionDomain could simply be null; you basically only need to present the class name and its byte code.

b) Offering an API to define additional classes and to find classes that are included in the image such as:

interface GraalInstrumentation {
  void addClassFileTransformer(ClassFileTransformer cft);
  byte[] findClass(String className);
  void addClass(byte[] classFile);
  void include(JarFile file);
  void addConfig(ReflectionConfig config);
  void addEntryPoint(String className);
}

The last method would be used to register a class, already added to the image, that has a method to execute before the instrumented program, similar to premain but without the possibility to request an Instrumentation instance.

Given this, one could offer an API for the Graal native-image compiler where the Instrumentation interface was replaced with the above. The same should then ideally also work for standard AOT compilation.

As a bonus one could consider supplying a pseudo class loader to the above class file transformer. This class loader could then return class files from its getResourceAsStream method. This would make it a bit easier to reuse existing agent code.

For my purposes, this would be sufficient.
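To make the proposal above more concrete, here is a minimal sketch of the transformer-registration part only. It assumes a hypothetical `BuildTimeTransformerChain` class (not part of GraalVM or native-image) that hands each class file to the registered transformers with `null` for the class loader, class and protection domain, as suggested in point a). Only `java.lang.instrument.ClassFileTransformer` is a real JDK type here; everything else is illustrative.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical build-time registry; not an existing GraalVM or native-image API.
final class BuildTimeTransformerChain {
    private final List<ClassFileTransformer> transformers = new ArrayList<>();

    // Mirrors addClassFileTransformer from the proposed interface.
    void addClassFileTransformer(ClassFileTransformer cft) {
        transformers.add(cft);
    }

    // Runs every registered transformer, in registration order, over one class file.
    // internalName uses the JVM form, e.g. "com/example/App".
    byte[] apply(String internalName, byte[] classFile) {
        byte[] current = classFile;
        for (ClassFileTransformer cft : transformers) {
            try {
                // Loader, class being redefined and protection domain are simply null.
                byte[] result = cft.transform(null, internalName, null, null, current);
                if (result != null) { // null means "no change" in the transformer contract
                    current = result;
                }
            } catch (IllegalClassFormatException e) {
                throw new IllegalStateException("Transformer rejected " + internalName, e);
            }
        }
        return current;
    }
}
```

An image builder following this idea would call `apply` once for each class before adding it to the image. Transformers that dereference the loader or protection domain would need adapting, which is where the pseudo class loader mentioned above could help.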

thomaswue commented 5 years ago

@thegreystone Would the approach as described by @raphw also work for your use case?

raphw commented 5 years ago

I prototyped a bit and extended my list of requirements by a few things. With this I could make most of my agents Graal compatible in short time.

thegreystone commented 5 years ago

A separate API is OK for me. That said, if the tooling could take unmodified JPLIS agents, place them in a little sandbox and throw all classes to the premain (and also intercept calls to Unsafe#defineClass etc for any helper classes generated), that would probably allow many existing agents to work with graal without much modification. Harder to accomplish and maintain though.

vjovanov commented 5 years ago

I am adding an agent to native-image as we speak; I am a bit afraid of what happens when these agents start interacting. Things that worry me are:

The order of agents. For example, my agent adds a call at the beginning of every static initializer. What if someone else's agent adds something before it?
Agents can modify the code of native-image itself. This includes very sensitive things such as the garbage collector and deoptimization routines. How do we ensure that the agent will not mess with the native-image internals?
Agents can introduce code that is generated after the native-image agents have run. For example, one of the agents will rewrite invokedynamic instructions for lambdas. If the user-space agent introduces a lambda, all our assumptions are broken.

With all of this said, it would be good to have examples of the agents that you want to use. It would give me a better idea of how to support this. I would like to see quite a few use-cases before exposing this in the API.

I think you could try the agents already: If you pass -J-javaagent:<agent-jar> it should work out of the box, given agents do reasonable things.

I don't think we will ever be able to guarantee that native image will work with agents that modify classes of the native image itself. Let's see how we can support this elegantly.
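For readers unfamiliar with what an `<agent-jar>` contains, the following is a minimal sketch of a standard `java.lang.instrument` agent. The class name `SeenClassesAgent` is hypothetical, and the transformer deliberately changes nothing; it only records which classes it is offered. This can be a low-risk way to see what an agent would touch when it is passed to the image builder. Whether a given agent actually works in the image build is not guaranteed by this sketch.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical agent class; the jar manifest would need "Premain-Class: SeenClassesAgent".
// Kept package-private so the sketch fits in one file; a real agent would typically be public.
final class SeenClassesAgent {
    // Internal names (e.g. "com/example/App") of every class offered to the transformer.
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    // Standard java.lang.instrument entry point, invoked by the JVM before main().
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new RecordingTransformer());
    }

    static final class RecordingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null) { // the name can be null for some generated classes
                SEEN.add(className);
            }
            return null; // null tells the JVM to keep the original bytes
        }
    }
}
```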

mwilson-cb commented 5 years ago

Datadog APM is one agent that has issues.

vjovanov commented 5 years ago

I have written an agent for tracing class initialization and based on that experience I believe we need to: 1) Let all user agents run before the native-image ones. 2) Restrict the scope of agents to classes loaded by the NativeImageClassLoader. Our native-image code has various assumptions about the shape of the bytecode (e.g., some parts of the code must not allocate), so not just any agent will work. However, if the agent is restricted to user-space all should work.

Is it possible to restrict the scope of the Datadog APM?
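One way an agent author might restrict scope on their side is to wrap the real transformer in a filter. The sketch below is an assumption-laden illustration: it filters by class-name prefix rather than by the NativeImageClassLoader mentioned above, because checking the loader would require referencing builder internals. `UserSpaceOnlyTransformer` and the prefix list are hypothetical names.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
import java.util.List;

// Hypothetical wrapper; prefix matching stands in for a class-loader check.
final class UserSpaceOnlyTransformer implements ClassFileTransformer {
    private final ClassFileTransformer delegate;
    private final List<String> userPrefixes; // internal-name prefixes, e.g. "com/example/"

    UserSpaceOnlyTransformer(ClassFileTransformer delegate, List<String> userPrefixes) {
        this.delegate = delegate;
        this.userPrefixes = List.copyOf(userPrefixes);
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer)
            throws IllegalClassFormatException {
        if (className == null || !isUserClass(className)) {
            return null; // leave JDK and image-builder classes untouched
        }
        return delegate.transform(loader, className, classBeingRedefined,
                protectionDomain, classfileBuffer);
    }

    // True only for classes under one of the configured user-space prefixes.
    boolean isUserClass(String internalName) {
        for (String prefix : userPrefixes) {
            if (internalName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

An allow-list is used rather than a deny-list so that builder classes added in future releases are skipped by default.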

mwilson-cb commented 5 years ago

Not sure but here is the Datadog APM Java agent code https://github.com/DataDog/dd-trace-java

raphw commented 5 years ago

Most such agents target mainly user space but also some JVM-specific classes, often those responsible for context switching such as Thread or ThreadPoolExecutor.

You are saying that passing an agent with -J-javaagent:<agent-jar> to the native image compiler should basically instrument the targeted image?

vjovanov commented 5 years ago

Yes, it will if it does not break the image builder. I will soon merge the PR that does transformations for the Java lambdas. This will be a good starting point for adding extra agents.

adityasundaramurthy-maersk commented 3 years ago

Sorry for resurrecting this zombie thread, but has anyone managed to get agents packaged into the native image? I'm also looking at instrumenting application code with the Datadog javaagent to capture telemetry. @mwilson-cb @vjovanov

vjovanov commented 3 years ago

Until we separate the user code from the image builder code this feature can't become officially supported (this is a long-term project with uncertain date of completion). You can always try this at your own risk by simply including the agent in the image builder and see what happens.

It worked well for the lambda-rewriting and it will work well for many agents.

raphw commented 3 years ago

I am wondering about a thing here: If I use a Java agent in the build that transforms classes of the core libraries which are GPL licensed, would I need to provide sources (byte code) of the transformed methods as those would be considered derivatives of the original classes that are not covered by the class path exception?

binoysankar commented 3 years ago

Has anyone tried including the Datadog agent in a native image build? I am searching for a solution but couldn't find one.

thomaswue commented 3 years ago

@thegreystone Any thoughts re datadog agent in a native image?

dougqh commented 3 years ago

Yes, we (Datadog) are starting to see increased interest in native image, so we're starting to look into it. However, we don't have any concrete plans just yet.

cemo commented 3 years ago

The problem is that there is no clear effort on this issue. GraalVM is a big step in the JVM world but without supporting agents how can we move forward?

CC: We are also using DD @dougqh

cforce commented 2 years ago

We also have the challenge that we have no clue how to use Datadog together with a Graal native image.

cforce commented 2 years ago

nhandev552 commented 1 year ago

Hi @dougqh, can the Datadog agent work with native image now? I also tried adding -J-javaagent in a Micronaut project, but the build failed.

mat-613 commented 1 year ago

Hi all, any news on this?

thomaswue commented 1 year ago

So using a bytecode-transforming agent at image build time has already been working for some time. The only aspect the agent needs to be careful about is not to transform the classes of the image generator in a way that influences the generation process. So the agent needs to work exclusively on user classes.

Another aspect related to this is that we had more discussions on how to support the Dynatrace agent in native image and for this we are now starting to implement basic JVMTI functionality for the image itself, such that one can attach a JVMTI agent when starting the generated image.

I don't know which of those approaches would be relevant for DataDog. Maybe @thegreystone can chime in about what specific requirements the support of DataDog in native image would have.

ziyilin commented 9 months ago

This PR https://github.com/oracle/graal/pull/8077 tries to solve this problem.

oracle / graal

Support agents at image generation time #1065