cloudfoundry / java-buildpack

Cloud Foundry buildpack for running Java applications
Apache License 2.0
437 stars 2.58k forks source link

Support for user specified MaxPermGen and MaxMetaSpace without fork #122

Closed youngm closed 9 years ago

youngm commented 9 years ago

Over the last couple of years we've gathered a lot of user experience with the java-buildpack and its fork to customize approach. For the most part this approach has worked well and usually when faced with the choice to fork or just go with the default behaviour our users realize the default behaviour is good enough for them.

However, one issue that seems to frequently pop up is the issue of perm gen tuning. It is difficult to find a set of permgen settings that works for most smaller apps but don't waste memory for apps that require more permgen.

We could provide our users with small and large permgen forks but we already provide our users with standard jdk7 and jdk8 forks. Given more permgen flexibility would jump our supported forks from 2 to 4 and potentially add complexity for most of our users that don't need that complexity.

I'd like to propose an option, for apps that need it, to specify a fixed permgen/metaspace size for use in memory heuristics calculations without forking the buildpack. I'm not proposing a universal solution that would give users fine grained control over all of their memory settings. From my experience the only item that really seems to need to be tweaked is Permgen/metaspace all our users are happy to let the buildpack calculate the other memory settings.

I was thinking that users today have the option of specifying JAVA_OPTS via an environment variable. I'd like to propose that if the user supplies an MaxPermGen or MaxMetaSpace option that the buildpack honor that as if it were a fixed metaspace/permgen config parameter supplied to the jvm /config/open_jdk_jre.yml. All other permgen and metaspace parameters would be invalid as user specified values as they are today.

Thoughts?

cgfrost commented 9 years ago

We will have a conversation about this internally, I can't make a call on it myself.

FYI: https://www.pivotaltracker.com/story/show/86290998

cgfrost commented 9 years ago

Hi Mike,

Are you thinking that specifying a minimum permgen/metaspace would be enough or that you want to give an exact value? Also, do you see in the future that you might want to override other memory settings or even other config values. I have some concerns; that we provide a specific solution for metaspace/permgen but then have to generalise it further down the road. I also think it is potentially confusing for users if they do provide an exact and we know it's too high (or low?) to work, do we adjust it and provide some feedback on the console or just let the app go and fail.

No matter what we decide to do I'm in favour of the current behaviour remaining the default so the question is how best to provide additional configuration options.

Chris.

youngm commented 9 years ago

@cgfrost Good points. I was thinking of only allowing the user to specify a fixed maximum permgen. I believe there are certain memory values that are fixed to the application code not the runtime of the application. Permgen is one of those values Xss could be another. Heap is different where the amount of heap an application needs could change with the load of the application not just with the code of the application.

I think it fair to say that there should be a way for an application to specify those values set by code not runtime load without having to fork the buildpack. Of those parameters MaxPermSize is the only one that I have user demand to expose today.

I think it fair for the buildpack to guard against minimums and maximums for these values and error out in the case of obvious error. MaxPermSize, for example, could minimum to a couple of meg and Maximum to some percentage of the total mem configured.

youngm commented 9 years ago

@cgfrost I agree that current behaviour should remain default. My request is that if an app knows that their code needs for something like PermGen there should be a non forking way for them to set it to a fixed value allowing them to more efficiently use the remaining memory as heap.

cgfrost commented 9 years ago

@youngm We are still discussing this but in the mean time how would you feel about a generic solution that allows any buildpack config to be overridden by providing environment variable(s) of some form. I believe this could work at a technical level but I'd like to know what you think about giving users that much power?

youngm commented 9 years ago

@cgfrost I wouldn't have a problem with that level of customizability. I think it is more flexibility than me and my customers have needed in the past, but, I don't really see a problem giving this power to the users either.

This would broaden the buildpack's api. Customers would be angry if I upgrade the buildpack and suddenly their ENV overridden config stops working. I think that would be my main concern.

nebhale commented 9 years ago

@youngm Can you elaborate a bit on that last statement? If we said that values in config/ could be overridden by an ENV that corresponded to the same key, that shouldn't affect ENV config during upgrades, right?

youngm commented 9 years ago

@nebhale It just a comment on how the config keys would become part of the API of the buildpack. If you change a config key from one buildpack version to another you run the risk of messing up an app who has overridden that config key.

You could make this more obvious by prefixing all Env config override with something then alerting the user at stage time if a key they are attempting to override with ENV entries don't presently exist in the buildpack.

lhotari commented 9 years ago

@nebhale I'd like to be able to override config/ files by pushing the modified config files together with the application files without forking the java-buildpack. Using ENV for overriding config key-value pairs seems like a bad solution imho. Is there an existing feature request about this?

nebhale commented 9 years ago

@lhotari It's been discussed on and off for a while and we're thinking about pulling the trigger for real this time. The problem with your solution is that it's a bit "Groovy-centric" :smile:; it doesn't work if you're pushing a JAR unless you want Maven/Gradle to package those files at build time.

nebhale commented 9 years ago

@youngm Great idea to ensure that values are actually overriding something and warn otherwise. We didn't have that in mind, but it's a very elegant way of removing ambiguity.

lhotari commented 9 years ago

@nebhale I wonder if it would be possible to design and implement the java-buildpack to accept a directory (which is pushed with cf-cli) which contains a single war/jar file and a config directory with the overridden config files?

Is this implementation idea groovy-centric somehow?

lhotari commented 9 years ago

the config/ directory on the app side could have some unique name so that the java-buildpack could exclude it from the actual application files that get deployed by the buildpack.

cgfrost commented 9 years ago

Hi @lhotari How would you feel about supplying configuration, that can override the buildpack config, through a CF environment variable. Why do you think this is a bad solution. As environment variables can be set in an applications manifest.yml file it will still be possible to check the overriding config in to source control. I see two advantages of this over using a specially named directory.

  1. All configuration to do with the application on Cloud Foundry can be in one place, the manifest.yml file.
  2. A special directory creates another way of providing config/setup to Cloud Foundry while Environment variables are already established for that purpose.

Other people have suggested using environment variables to configure other aspects of the buildpack, such as the version of Java to use. I'm really keen to avoid special cases and using both environment variables and a special directory as a generic mechanism for overriding config will be confusing imho. Trying to figure out which one wins when there is a conflict and having multiple places to check when debugging etc...

Let me know what you think.

lhotari commented 9 years ago

Hi @cgfrost , I understand your viewpoint. It's a larger change to add another way to configure the buildpack.

The reason I think it's a bad solution to use environment variables is that there isn't a 1-1 mapping to the yaml configuration files by using environment variables. yaml files provide a much broader way to express configuration and support lists and types.

The other advantage is that there would be a 1-1 mapping to the configuration files and values that are present in the config/ directory.

I just think this would be a more simple way to override anything in the buildpack configuration without forking the buildpack.

Perhaps there are other ways to reach the same goal, but I don't think that can be done solely with environment variables.

cgfrost commented 9 years ago

@lhotari We were actually thinking of doing it in a single variable.

JAVA_BUILDPACK_CONFIG="[tomcat:tomcat:version:7.0.+, open_jdk_jre:version:1.7.0_+, open_jdk_jre:memory_heuristics:{heap:75, metaspace:10, stack:5, native:10}]" 

This allows proper yaml notation where the first element must match the name of a config file without the '.yml' extension. This example would change the Tomcat and Java versions and also update the Java memory heuristics. You get a nice mapping to the default config but in a very concise way.

lhotari commented 9 years ago

@cgfrost Thanks, seems like a nice and simple solution for configuring the java buildpack without forking.
how are you planning to parse JAVA_BUILDPACK_CONFIG ?

cgfrost commented 9 years ago

Environment variables are available to the buildpack and there is already have a single point of access for configuration throughout the buildpack. I will extend that code to update the config with any values from the environment variable, I can just load it in with String.to_yaml(). If the environment variable is malformed or contains values that don't exist in the default config a warning will get displayed.

youngm commented 9 years ago

@cgfrost I think I'm less excited by the single environment variable syntax. I think I would prefer a single environment variable per config variable. My experience shows few apps will need to override a single config value let alone 2. I'd think it easier to list those config values as independent environment variables than for me to give my users the config value and have to teach/show them how to include it into the config string. Not a big deal to me but a thought.

cgfrost commented 9 years ago

I guess there is a middle ground of one variable per config file. Something like this for example;

JAVA_BUILDPACK_CONFIG_OPEN_JDK_JRE="memory_heuristics:{heap:75, metaspace:10, stack:5, native:10}"
youngm commented 9 years ago

Env per config entry may help flatten it out a little. It is a difficult balance to strike. I'm sure I'll be fine with whatever you decide.

I'm assuming that:

JAVA_BUILDPACK_CONFIG_OPEN_JDK_JRE="memory_heuristics:{metaspace:256mb..}"

would leave the other memory heuristics untouched?

nebhale commented 9 years ago

@youngm Yes, the plan is that it would be an overlay (and totally punting on how to "remove" an entry). The problem with "single environment variable per config variable" is that as of today we'd need about 80 candidates and many of them would look like JAVA_BUILDPACK_CONFIG__OPEN_JDK_JRE__MEMORY_HEURISTICS__METASPACE=256m.. That's not the worst thing in the world, it just becomes unwieldy if you change a number of them. I am however interested in whether you think we're looking at the tradeoff incorrectly. That perhaps we should optimize for a small number of changes rather than large-scale ones.

youngm commented 9 years ago

@nebhale Good question. For my deployment we currently maintain and upload 2 branches for our users to pick from:

Tomcat 7 vs 8 is interesting because it needs a small change in server.xml file that cannot be made via environment variables.

Because these two packages represent organizational standards we will probably continue maintaining our 2 branches going forward instead of moving to a single branch that users customize with ENV variables.

So, that said the only config I've got requests to tweak as one offs is PermGen and native memory.

For other orgs I can see Java version and command line args perhaps more common. Are there others configs you've got requests to easily tweak?

nebhale commented 9 years ago

@youngm I'll let @cgfrost chime in with the authoritative list, but I believe that it's Java version, memory, and args at this time. I'm worried though that if we add those ENV vars (and continue to use the existing $JAVA_OPTS), at some point in the future we might need a full-fledged configuration system and we'll have to support the one-offs for backwards compatibility.

Since this "full configuration" approach could very obviously be a premature optimization, we're trying to gather as much information before making the full/limited configuration call.

youngm commented 9 years ago

Hey @cgfrost and @nebhale I don't want to be a pest but has any further thought gone into this issue over the last month? I thought we were making good progress for a while.

cgfrost commented 9 years ago

Hi @youngm We have a new design on the backlog to get implemented but other tasks keep jumping in front of it. It will get done but I'm not going to make any promises about when.

youngm commented 9 years ago

@cgfrost thanks for the update. Not in a big rush just curious if it was still on the radar.

youngm commented 9 years ago

Just a quick update. I ran into a situation today where a user wanted to temporarily see access logs on the tomcat side to help debug an issue they were experiencing. It would have been nice to enable tomcat access_logging_support with an environment variable change and restage.

Instead we had to:

So, I guess temporarily enabling tomcat access logging could be another use case for ENV config overriding.

nebhale commented 9 years ago

@youngm Just wanted to let you know that this will be the tentpole feature in the upcoming 3.0 release. Thanks for all your input and understanding while it gestated.