ruby / psych

A libyaml wrapper for Ruby
MIT License
566 stars 206 forks source link

Allow setting defaults for SnakeYAML limits #647

Closed headius closed 1 year ago

headius commented 1 year ago

This adds class-level accessors for the following SnakeYAML-specific parser settings:

The initial values are based on SnakeYAML Engine's defaults. This PR does not modify those default values.

Using these accessors, it should be possible for users to globally change them for all future Psych parser instances without resorting to monkey patches as described in https://github.com/ruby/psych/pull/613#issuecomment-1707712390.

headius commented 1 year ago

@oliverbarnes Please verify if you get a chance!

oliverbarnes commented 1 year ago

Having a look 🙏

oliverbarnes commented 1 year ago

So, I've declared the psych gem with your fork and branch:

gem "psych", github: "headius/psych", branch: "jruby_default_limits"

And I'm getting the following error when bundling:

Fetching gem metadata from https://rubygems.org/.......
Resolving dependencies...
--- ERROR REPORT TEMPLATE -------------------------------------------------------

TypeError: no implicit conversion of nil into String
  org/jruby/RubyFile.java:1047:in `join'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/installer.rb:212:in `do_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/installer.rb:170:in `vendor_jars'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/post_install_hook.rb:28:in `block in <main>'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:43:in `block in run_hooks'
  org/jruby/RubyArray.java:1989:in `each'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:42:in `run_hooks'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:34:in `post_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path.rb:244:in `generate_bin'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/git.rb:194:in `install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/gem_installer.rb:54:in `install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/gem_installer.rb:16:in `install_from_spec'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/parallel_installer.rb:156:in `do_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/parallel_installer.rb:147:in `block in worker_pool'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:62:in `apply_func'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:57:in `block in process_queue'
  org/jruby/RubyKernel.java:1601:in `loop'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:54:in `process_queue'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:90:in `block in create_threads'

The canonical psych gem bundles fine.

I'm still investigating if there's something on our end that might cause this somehow, but thought it was worth letting you know.

headius commented 1 year ago

So what you're saying I should actually test this, eh? 😝

I'll make some fixes.

headius commented 1 year ago

Oh actually I see the problem... JRuby doesn't support installing from Github because we don't build the extension at install time, we build it at build time. So if you build the gem from the fork and install it you should be able to activate it and try it out.

oliverbarnes commented 1 year ago

Ok, cloned the fork and installed the gem from the PR branch:

rake compile:psych
rake install

and configured my app's gemfile to use it.

gem "psych", "5.1.0", platform: :jruby

On the initializer for psych I declared

# I might be misreading PsychParser.java, though
Psych::Parser.code_point_limit = 20_000_000

And when running a spec, I get the following error now:

➜  myapp git:(spike-jruby) ✗ be rspec spec/my_spec.rb
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:471: warning: previous definition of strptime was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:490: warning: previous definition of parse was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:737: warning: previous definition of parse was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/concurrent-ruby-1.1.10/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_int
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/concurrent-ruby-1.1.10/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_f
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/nokogiri-1.15.3-java/lib/nokogiri/xml/node.rb:1007: warning: method redefined; discarding old attr
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bindata-2.4.15/lib/bindata/base.rb:80: warning: previous definition of initialize was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/ruby-debug-0.11.0/cli/ruby-debug/commands/list.rb:36: warning: `-' after local variable or literal is interpreted as binary operator
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/ruby-debug-0.11.0/cli/ruby-debug/commands/list.rb:36: warning: even though it seems like unary operator
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/zeitwerk-2.6.11/lib/zeitwerk/kernel.rb:38: warning: parentheses after method name is interpreted as an argument list, not a decomposed argument

An error occurred while loading spec/my_spec.rb. - Did you mean?
                    <snip>

Failure/Error: require File.expand_path("../config/environment", __dir__)

LoadError:
  load error: ~/myapp/config/environment -- java.lang.ClassCastException: class java.lang.Integer cannot be cast to class org.jruby.runtime.builtin.IRubyObject (java.lang.Integer is in module java.base of loader 'bootstrap'; org.jruby.runtime.builtin.IRubyObject is in module org.jruby.dist of loader 'app')
# ./spec/rails_helper.rb:6:in `<main>'
# ./spec/support/requests_helper.rb:3:in `<main>'
# ./spec/my_spec.rb:3:in `<main>'
# ------------------
# --- Caused by: ---
# Java::JavaLang::ClassCastException:
#   class java.lang.Integer cannot be cast to class org.jruby.runtime.builtin.IRubyObject (java.lang.Integer is in module java.base of loader 'bootstrap'; org.jruby.runtime.builtin.IRubyObject is in module org.jruby.dist of loader 'app')
#   org.jruby.ext.psych.PsychParser.<init>(PsychParser.java:149)
headius commented 1 year ago

@oliverbarnes Ooops yup that was on me. I originally planned on storing the non-Ruby object version of these values, and then changed my mind. Unfortunately I didn't change all of the code.

Try it with latest version (force pushed)!

oliverbarnes commented 1 year ago

:) seems to be working now. No environment load error, nor any code limit error. I'm now hitting that error from my use case:

VCR::Errors::UnhandledHTTPRequestError: 

described on https://github.com/ruby/psych/issues/648

That's for code_point_limit of course - I haven't checked the other config options:

So what you're saying I should actually test this, eh? 😝

not a bad idea 😄 a unit test would be nice, at least

headius commented 1 year ago

I added a test for code_point_limit.

I tried to add a test for allow_duplicate_keys but I couldn't get it to fail. 🤔

Still looking for good examples for recursive keys and alias limits.

headius commented 1 year ago

@asomov Can you point us toward some simple tests for the following settings? I tried to find something in the SnakeYAML Engine repo but have not found anything.

I also tried parsing a simple duplicate key script and it did not error, no matter what setting I provided.

foo: bar
foo: bar

Should this be rejected?

oliverbarnes commented 1 year ago

I added a test for code_point_limit.

👍 thanks, test looks good to me

I tried to add a test for allow_duplicate_keys but I couldn't get it to fail. 🤔

Still looking for good examples for recursive keys and alias limits.

Maybe worth creating separate PRs?

headius commented 1 year ago

Maybe worth creating separate PRs?

Yeah let's go with that idea.