middleman / middleman

Hand-crafted frontend development
https://middlemanapp.com
MIT License

`middleman build --track-dependencies --missing-and-changed` fails on GitLab.com site build with latest master #2228

Closed smcgivern closed 5 years ago

smcgivern commented 5 years ago

I can't give steps from a clean install, but if you check out https://gitlab.com/gitlab-com/www-gitlab-com/commit/e769a6745bab59096c159a0437f16f48dd0dd1eb you will see this failure when running bundle exec middleman build --track-dependencies --missing-and-changed:

/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/data/proxies/base.rb:25:in `method_missing': undefined method `each_key' for # (NoMethodError)
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/data/proxies/hash.rb:22:in `method_missing'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/collections/lazy_step.rb:37:in `value'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/collections.rb:105:in `block (2 levels) in manipulate_resource_list_container!'
    from /usr/local/lib/ruby/2.4.0/set.rb:324:in `each_key'
    from /usr/local/lib/ruby/2.4.0/set.rb:324:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/collections.rb:104:in `block in manipulate_resource_list_container!'
    from /usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/core_extensions/collections.rb:87:in `manipulate_resource_list_container!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/store.rb:181:in `block (4 levels) in ensure_resource_list_updated!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/activesupport-5.1.6.1/lib/active_support/notifications.rb:168:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/store.rb:177:in `block (3 levels) in ensure_resource_list_updated!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:1316:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:1316:in `traverse_depth_first'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:431:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/store.rb:176:in `block (2 levels) in ensure_resource_list_updated!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/activesupport-5.1.6.1/lib/active_support/notifications.rb:168:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/store.rb:169:in `block in ensure_resource_list_updated!'
    from /usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/store.rb:166:in `ensure_resource_list_updated!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/sitemap/extensions/on_disk.rb:21:in `ready'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/callback_manager.rb:57:in `instance_exec'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/callback_manager.rb:57:in `block in execute'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:1316:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:1316:in `traverse_depth_first'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/hamster-3.0.0/lib/hamster/vector.rb:431:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/callback_manager.rb:57:in `execute'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/callback_manager.rb:28:in `block in install_methods!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/application.rb:305:in `initialize'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-cli/lib/middleman-cli/build.rb:78:in `new'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-cli/lib/middleman-cli/build.rb:78:in `block in build'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/activesupport-5.1.6.1/lib/active_support/notifications.rb:168:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-cli/lib/middleman-cli/build.rb:74:in `build'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `block in invoke_all'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `map'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `invoke_all'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/group.rb:232:in `dispatch'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:115:in `invoke'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:40:in `block in register'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-c6898b8c2875/middleman-cli/bin/middleman:64:in `'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman:23:in `load'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman:23:in `'

The full CI output is at: https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/138475784

I don't totally understand what's going on here, but I did notice Set show up in the backtrace, and Set#each uses Hash#each_key internally, so that might be a clue if the Set is somehow also using the hash proxy.

Using 067dd350c8ccb0b40ded53b50cbd003a04cb0252 (which was before https://github.com/middleman/middleman/pull/2222) seems to work, although I've been waiting for quite a long time for the build to actually finish and give me my deps.yml.

tdreyno commented 5 years ago

Added each_key to the white list.
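
For readers following along, here is a minimal sketch of the allow-list pattern being discussed. `AllowListProxy` is a hypothetical stand-in, not Middleman's actual proxy class: any method that is not on the list falls through to `super` and raises `NoMethodError`, which is the failure mode in the backtrace above.

```ruby
# Hypothetical stand-in for a data proxy; not Middleman's real implementation.
class AllowListProxy
  def initialize(data, allowed)
    @data = data
    @allowed = allowed
  end

  # Forward only allow-listed methods; everything else raises NoMethodError.
  def method_missing(name, *args, &block)
    return @data.public_send(name, *args, &block) if @allowed.include?(name)

    super
  end

  def respond_to_missing?(name, include_private = false)
    @allowed.include?(name) || super
  end
end

strict = AllowListProxy.new({ a: 1, b: 2 }, [:keys])
relaxed = AllowListProxy.new({ a: 1, b: 2 }, [:keys, :each_key])

seen = []
relaxed.each_key { |key| seen << key } # forwarded to the underlying Hash

begin
  strict.each_key { |key| seen << key }
rescue NoMethodError => e
  missing = e.name # the method that was not on the allow-list
end
```

The trade-off of this design is that every method a caller might reasonably use has to be listed explicitly, which is why the missing methods in this thread surface one at a time.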

smcgivern commented 5 years ago

Thanks! I now have the same problem with #each:

$ bundle exec middleman build --track-dependencies --missing-and-changed
== Preferring use of LibSass
== Blog Sources: posts/{year}-{month}-{day}-{title}.html (:prefix + :sources)
bundler: failed to load command: middleman (/Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman)
NoMethodError: undefined method `each' for #<Middleman::CoreExtensions::Data::Proxies::HashProxy:0x00007fddd513bbf8>
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/data/proxies/base.rb:25:in `method_missing'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/data/proxies/hash.rb:22:in `method_missing'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/collections/lazy_step.rb:37:in `value'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/collections.rb:105:in `block (2 levels) in manipulate_resource_list_container!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/2.4.0/set.rb:324:in `each_key'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/2.4.0/set.rb:324:in `each'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/collections.rb:104:in `block in manipulate_resource_list_container!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/collections.rb:87:in `manipulate_resource_list_container!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_reference.rb:43:in `send_to'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/call_with.rb:79:in `call_with'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/sitemap/store.rb:185:in `block (4 levels) in ensure_resource_list_updated!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-5.1.6.1/lib/active_support/notifications.rb:168:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/sitemap/store.rb:181:in `block (3 levels) in ensure_resource_list_updated!'
# ...

smcgivern commented 5 years ago

If I assume that we will also add #each to the whitelist, then I get a similar message but in our application's code:

NoMethodError: undefined method `url' for #<Middleman::CoreExtensions::Data::Proxies::HashProxy:0x00007fefd284ec60>
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/data/proxies/base.rb:26:in `method_missing'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-d3c527d45919/middleman-core/lib/middleman-core/core_extensions/data/proxies/hash.rb:22:in `method_missing'
  /Users/seanmcgivern/Code/www-gitlab-com/config.rb:147:in `block in evaluate_configuration!'

Our config there looks like this:

data.events.each do |event|
  next unless event.url

  proxy "/events/#{event.url.tr(' ', '-')}/index.html", '/events/template.html', locals: {
    event: event
  }, ignore: true
end

We're expecting this to be a Middleman::Util::EnhancedHash, which in turn is a Hashie::Mash, allowing keys to be accessed like methods. However, we're also relying on event.some_missing_key to return nil, not to blow up. I can address this on our side by replacing the line with next unless event.key?(:url), but we have a lot of places where we do this, so I can see this being a big pain point on upgrading.
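
A small sketch of the two guards, assuming a Mash-style object where a missing key reads as nil. `LenientData` is a hypothetical stand-in written for this example; it is neither Hashie::Mash nor Middleman code. Note that the two guards are not quite equivalent: a key that is present but set to nil passes `key?` and fails the truthiness check.

```ruby
# Hypothetical stand-in for Mash-style access; not Hashie or Middleman code.
class LenientData
  def initialize(hash)
    @hash = hash
  end

  def key?(name)
    @hash.key?(name.to_sym)
  end

  # Unknown keys read as nil instead of raising.
  def method_missing(name, *args)
    return super unless args.empty?

    @hash[name]
  end

  def respond_to_missing?(name, include_private = false)
    @hash.key?(name) || super
  end
end

events = [
  LenientData.new({ url: 'summit 2019' }),
  LenientData.new({ name: 'no url here' }),
  LenientData.new({ url: nil })
]

by_value = events.count { |event| event.url }     # truthiness guard
by_key = events.count { |event| event.key?(:url) } # presence guard
```

So switching from `next unless event.url` to `next unless event.key?(:url)` could change behaviour for entries whose `url` is explicitly empty.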

smcgivern commented 5 years ago

Oh, and Array#each_with_index is also missing (although that could also be fixed on the calling side by doing .each.with_index).

tdreyno commented 5 years ago

Thanks, I’ll look into making sure the remaining methods are correctly proxied.

I think I made sure the data was indifferent in templates, but may have not done that to config.

smcgivern commented 5 years ago

> I think I made sure the data was indifferent in templates, but may have not done that to config.

Hmm, I think after I changed the config I also saw this in templates too, but I can't remember for sure.

tdreyno commented 5 years ago

Ok, it should be allowing indifferent access, but I think it will throw instead of returning nil on missing keys. I'll change that.

smcgivern commented 5 years ago

Another couple of issues, sorry 🙂

to_json

We call #to_json on data files, and I had to hack around an issue here with dependency tracking: https://gitlab.com/gitlab-com/www-gitlab-com/commit/8eae8095459a5a2fa2a578bba036bc6d4d103782

Two things I noticed from that:

  1. #to_json doesn't appear to create a dependency.
  2. Testing this out on a sample project, with these additions:

    diff --git a/data/foo.yml b/data/foo.yml
    new file mode 100644
    index 0000000..083c5ae
    --- /dev/null
    +++ b/data/foo.yml
    @@ -0,0 +1,2 @@
    +a: 1
    +b: 2
    diff --git a/source/foo.json.erb b/source/foo.json.erb
    new file mode 100644
    index 0000000..8361f68
    --- /dev/null
    +++ b/source/foo.json.erb
    @@ -0,0 +1,3 @@
    +{
    +  "foo": <%= data.foo.to_json %>
    +}

    I don't get a stack overflow, but foo.json ends up with these contents, which are still wrong:

    {
      "foo": "#<Middleman::CoreExtensions::Data::Proxies::HashProxy:0x00007fa61fe95350>"
    }
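
One plausible explanation for that output, offered as a guess and not confirmed against Middleman's source: with the `json` library loaded, any object that does not define its own `to_json` falls back to serialising `to_s`. The two wrapper classes below are hypothetical and only illustrate that fallback and a delegating fix.

```ruby
require 'json'

# Hypothetical wrappers; neither is Middleman's HashProxy.
class OpaqueWrapper
  def initialize(data)
    @data = data
  end
end

class JsonAwareWrapper < OpaqueWrapper
  # Serialise the wrapped data rather than the wrapper itself.
  def to_json(*args)
    @data.to_json(*args)
  end
end

opaque = OpaqueWrapper.new({ 'a' => 1, 'b' => 2 }).to_json
aware = JsonAwareWrapper.new({ 'a' => 1, 'b' => 2 }).to_json
```

Here `opaque` is a JSON string containing the wrapper's `#<...>` representation, while `aware` is the JSON object for the underlying data.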

Array#slice

Ranges seem to be handled incorrectly: https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/139574700

       error  public/press/index.html
no implicit conversion of Range into Integer
/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/core_extensions/data/proxies/array.rb:28:in `slice'
/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/core_extensions/data/proxies/array.rb:28:in `slice'
/builds/gitlab-com/www-gitlab-com/source/press/index.html.haml:23:in `block in render'

(I think that https://github.com/middleman/middleman/blob/master/middleman-core/lib/middleman-core/core_extensions/data/proxies/array.rb#L28 shouldn't pass length at all.)
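
To illustrate the suspicion above: `Array#slice` accepts either a single Range or a start and length, so a wrapper that always supplies a length breaks the Range form. Both classes below are hypothetical sketches; the real `array.rb` may be shaped differently.

```ruby
# Hypothetical proxies; the real array.rb may differ.
class FixedLengthSlice
  def initialize(data)
    @data = data
  end

  # Always passes a length, so a Range as the first argument is rejected.
  def slice(start, length = 1)
    @data.slice(start, length)
  end
end

class ForwardingSlice < FixedLengthSlice
  # Forwards exactly the arguments the caller supplied.
  def slice(*args)
    @data.slice(*args)
  end
end

items = %w[a b c d]
forwarded = ForwardingSlice.new(items).slice(0..1)

begin
  FixedLengthSlice.new(items).slice(0..1)
rescue TypeError => e
  message = e.message
end
```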

smcgivern commented 5 years ago

And another one: Array#{first,last} don't work with the optional argument when proxied. (https://ruby-doc.org/core-2.4.4/Array.html#method-i-last)
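
A sketch of one way a proxy could support both call styles, assuming nothing about how Middleman implements it. `ArrayView` is a hypothetical name; the point is only that a splat forwards the optional count when it is given and omits it otherwise.

```ruby
# Hypothetical proxy; illustrative only.
class ArrayView
  def initialize(data)
    @data = data
  end

  # No argument returns one element; an Integer argument returns an Array.
  def first(*args)
    @data.first(*args)
  end

  def last(*args)
    @data.last(*args)
  end
end

view = ArrayView.new([1, 2, 3])
single = view.last     # one element
several = view.last(2) # an Array of the last two elements
```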

smcgivern commented 5 years ago

Working around the above issues, I now get: https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/139591206

/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:61:in `[]': no implicit conversion of Symbol into Integer (TypeError)
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:61:in `block in lookup_path'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:60:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:60:in `reduce'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:60:in `lookup_path'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-4c2b89bf41ef/middleman-core/lib/middleman-core/dependencies/vertices/data_collection_path_vertex.rb:48:in `current_hash'
# ...

This is promising, inasmuch as I'm pretty sure that means the site actually built and it just failed to write out the deps?
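
The error itself is what Ruby raises when an Array is indexed with a Symbol. A minimal sketch of a path lookup that avoids it by converting the segment to an Integer whenever the current value is an Array; `lookup_path` here is a hypothetical helper written for illustration, not the method in `data_collection_path_vertex.rb`.

```ruby
# Hypothetical helper; Middleman's lookup_path may be structured differently.
def lookup_path(data, path)
  path.to_s.split('.').reduce(data) do |current, segment|
    case current
    when Array then current[Integer(segment, 10)] # index arrays by Integer
    when Hash then current.fetch(segment.to_sym) { current[segment] }
    else return nil # the path runs past a leaf value
    end
  end
end

data = { webcasts: [{ youtube_url: 'first' }, { youtube_url: 'second' }] }
found = lookup_path(data, 'webcasts.1.youtube_url')

begin
  data[:webcasts][:youtube_url] # indexing an Array with a Symbol
rescue TypeError => e
  message = e.message
end
```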

tdreyno commented 5 years ago

This is why I need a real world app to test against. This is great :)

tdreyno commented 5 years ago

I cannot reproduce the to_json issue on master. Do you have a small test case repo I could look at?

tdreyno commented 5 years ago

N/m. found it.

tdreyno commented 5 years ago

Fixed first/last... god, Ruby can be gross sometimes. An optional param that completely changes the return type? Woof.

tdreyno commented 5 years ago

Fixed the symbol coercion issue as well, but it made another issue clear that I need to look into.

smcgivern commented 5 years ago

Thanks @tdreyno, let me know when I can try again.

tdreyno commented 5 years ago

Added a test suite for this stuff. Should be good to test again (and now I can track your issues in tests :)

smcgivern commented 5 years ago

Current master fails in this way:

/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:459:in `dump': singleton can't be dumped (TypeError)
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:459:in `process_incoming_jobs'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:437:in `block in worker'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:428:in `fork'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:428:in `worker'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:419:in `block in create_workers'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:418:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:418:in `each_with_index'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:418:in `create_workers'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:358:in `work_in_processes'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.12.1/lib/parallel.rb:264:in `map'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-d6cbe0f4b715/middleman-core/lib/middleman-core/builder.rb:201:in `output_resources'

https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/141454243

I'm trying again with --no-parallel.
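
For context, the `parallel` gem passes data between processes with `Marshal`, and `Marshal.dump` refuses any object that has methods defined on its singleton class. A minimal reproduction of that error follows; which Middleman object triggers it in the real build is not known from this thread.

```ruby
customised = Object.new

# Defining a method on a single object populates its singleton class.
def customised.greet
  'hi'
end

# Ordinary values survive a Marshal round trip.
round_trip = Marshal.load(Marshal.dump([1, 'two', { three: 3 }]))

begin
  Marshal.dump(customised)
rescue TypeError => e
  message = e.message
end
```

Running with `--no-parallel` sidesteps this because nothing has to be serialised between processes.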

smcgivern commented 5 years ago

OK, that worked! (Although obviously with a performance penalty.) https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/141462643

Building after changing a data file was a little weird: it did succeed, but actually took longer than a full build, and seemed to build everything anyway? https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/141478928

I'm going to see if I can get the deps.yml file to see why that happened.

smcgivern commented 5 years ago

@tdreyno is there a way to exclude a directory from the deps tracking? Our deps.yml is nearly 15 MB, and 278,150 lines. I noticed that we seem to include vendor/ (which is where we install our gems in CI):

$ grep vendor ~/Downloads/deps.yml | wc -l
    9480

You can see the whole file at https://gitlab.com/gitlab-com/www-gitlab-com/uploads/1f76fe36b520baf23dbaf91eda82354a/deps.yml.zip

smcgivern commented 5 years ago

We also don't seem to have any data files tracked at all?

smcgivern commented 5 years ago

The above makes me think I'm doing something wrong. Unfortunately, it's very hard to extract a minimal test case out of this repo because it's just so large.

tdreyno commented 5 years ago

No worries. These are great. I'm surprised by the parallel issue.

I'll look into the vendor one. Because Bundler can put them anywhere, it might be a little weird.

tdreyno commented 5 years ago

Ah, the code that's supposed to track lib and helpers is also picking up vendor and spec. I'll constrain it to just the known folders for MM.

It also seems like the data file path tracking might be worse than just tracking the entire file changing.

tdreyno commented 5 years ago

Okay, ignoring the vendor folder now. If you can re-gen, I want to take a deeper look at the data paths now.

smcgivern commented 5 years ago

Thanks. Now I have no deps.yml, I get:

bundler: failed to load command: middleman (/Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman)
NoMethodError: undefined method `each' for nil:NilClass
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/set.rb:81:in `initialize'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/immutable.rb:14:in `new'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/immutable.rb:14:in `new'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-core/lib/middleman-core/dependencies/graph.rb:113:in `invalidated'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_reference.rb:43:in `send_to'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/call_with.rb:79:in `call_with'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-core/lib/middleman-core/builder.rb:75:in `run!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_reference.rb:43:in `send_to'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/call_with.rb:79:in `call_with'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-cli/lib/middleman-cli/build.rb:109:in `block in build'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-5.2.2/lib/active_support/notifications.rb:170:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-cli/lib/middleman-cli/build.rb:108:in `build'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `block in invoke_all'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `each'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `map'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `invoke_all'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/group.rb:232:in `dispatch'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:115:in `invoke'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor.rb:40:in `block in register'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-11dab299f4d7/middleman-cli/bin/middleman:64:in `<top (required)>'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman:23:in `load'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman:23:in `<top (required)>'

I can work around this by removing --missing-and-changed from the command, but I think it's pretty natural to always run with that, whether you have the deps already tracked or not.

smcgivern commented 5 years ago

@tdreyno I got a new deps.yml, we're down to a mere 13 MB 😅

https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/143497052/artifacts/raw/deps.yml

EDIT: I don't see any files from data/ in there.

tdreyno commented 5 years ago

Ah, cool, let's avoid --missing-and-changed for now (but this will be VERY necessary on CI). I haven't added any tests for that flag yet, but I will.

So, right now data is tracked using a generic mechanism that can track arbitrary Hashes and Arrays. This means it can be used for remote data or variables in config.rb as well.

At the bottom of the deps, you'll see:

- :key: :webcasts.7.youtube_url
  :type: :data_collection_path
  :attributes:
    :hash: 783073cc5c38e373475e15108ce369fbfb3d6e98

Which means that data/webcasts.yml had an array, and the youtube_url key of the item at index 7 was depended on by source/webcast/template.html.haml and source/webcast/index.html.haml.
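
A rough sketch of how access paths like that could be recorded, under the assumption that a wrapper notes each leaf value that is read. `TrackedData` and its arguments are hypothetical names for illustration and do not reflect Middleman's actual classes.

```ruby
require 'set'

# Hypothetical access recorder; names are illustrative, not Middleman's API.
class TrackedData
  def initialize(value, path, accessed)
    @value = value
    @path = path
    @accessed = accessed
  end

  # Wrap nested collections so deeper reads extend the path;
  # record the full path once a leaf value is reached.
  def [](key)
    child_path = @path + [key]
    child = @value[key]
    if child.is_a?(Hash) || child.is_a?(Array)
      TrackedData.new(child, child_path, @accessed)
    else
      @accessed << child_path.join('.')
      child
    end
  end
end

accessed = Set.new
data = TrackedData.new({ webcasts: [{ youtube_url: 'https://example.com/v' }] }, [], accessed)
url = data[:webcasts][0][:youtube_url]
paths = accessed.to_a
```

After the read, `paths` holds the single dotted path `webcasts.0.youtube_url`, which is the same shape as the `:key:` entry shown above.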

It looks like in your case, there's a bit of data, a bunch of templates and pretty much everything depends on most everything (especially site.yml, features.yml, roles.yml, release_posts.yml). So the fine grain dependency map is pretty large and noisy.

I do have a mode that maps more simply from YAML file -> template, but I'm not sure I want to go that direction yet. One of the use cases I'm trying to hit is "Edit 1 Person model in Contentful should only rebuild the person detail and a person index."

I may try a hybrid, which is that the tracking only does 1 level of depth. So, a list of people will solve the above, but if those people each have an address, then anything with any of the addresses will rebuild.

When removing all of the access-specific data deps, the file is 4.7 MB.

The edges are 1 item per pair, but the keys are the same for each value. I can change the data model to avoid that duplication (results in 65% savings, down to 1.6 MB). It appears you have 2198 template files in your project, which is pretty impressive. 1.6 MB for that large of a site doesn't feel too bad.
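
As a sketch of that de-duplication, assuming edges are stored as key/dependent pairs: grouping by key writes each key once with its dependents listed beneath it. The file names are made up for the example, and the real deps.yml layout is not shown in this thread.

```ruby
# Illustrative data only; not taken from the real deps.yml.
pairs = [
  ['data/site.yml', 'source/index.html.haml'],
  ['data/site.yml', 'source/about.html.haml'],
  ['data/roles.yml', 'source/jobs.html.haml']
]

# One entry per key, with every dependent listed beneath it.
grouped = pairs.group_by(&:first).transform_values { |edges| edges.map(&:last) }
```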

I'll let you know as I:

tdreyno commented 5 years ago

The initial hash time of 2198 files might suck. I'm thinking of other ways... You all are git experts, is there any way to use the hash git already has calculated? When I played with it, I couldn't tell if it was hashing in real time or from some cache.

smcgivern commented 5 years ago

> You all are git experts, is there any way to use the hash git already has calculated?

git rev-parse HEAD:$path will show you the SHA of the blob at that path. That depends on:

  1. The path being in the repo (i.e. not generated by something else).
  2. This being a git repo in the first place.

It is not the same as the SHA of the contents because the blob has a header. If that's not a problem, and is significantly faster, that might help? You could also use the tree SHAs to see if a whole directory has changed.
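
To make the header point concrete, here is a minimal sketch for a SHA-1 repository: git hashes the string `blob <byte size>\0` followed by the file contents, so the result differs from a plain SHA-1 of the same bytes. `git_blob_sha` is a helper name made up for this example.

```ruby
require 'digest'

# Git hashes "blob <byte size>\0" followed by the content, not the content alone.
def git_blob_sha(content)
  Digest::SHA1.hexdigest("blob #{content.bytesize}\0#{content}")
end

content = "hello world\n"
blob_sha = git_blob_sha(content)            # what git stores for this content
plain_sha = Digest::SHA1.hexdigest(content) # what shasum would print
```

The two values differ for the same input, so hashes from the two sources cannot be mixed within one dependency file.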


I think the plans above to reduce depth are good. The data model is really cool, I didn't realise the dependency tracking was so advanced! Unfortunately, as you noticed, the GitLab site's 'architecture' is basically the big-ball-of-mud model.

tdreyno commented 5 years ago

Tried on a repo locally with 647 files

shasum: 0m10.301s
rev-parse: 0m0.020s

So that's promising. I already have the hashing abstracted, so it could prefer git when available. Does rev-parse work if there are uncommitted changes?

tdreyno commented 5 years ago

Actually, I can just check... and I did. Doesn't work with uncommitted. Still, good route to explore.

tdreyno commented 5 years ago

Okay, adding new flag --data-collection-depth=1 here: https://github.com/middleman/middleman/pull/2239

Right now it defaults to Infinity. I'll probably default to 2 once it's working well. Or 1, not sure. Need to update old tests either way.

tdreyno commented 5 years ago

After that merges, I'll tackle the content model to save file size, but not likely time.

Then, we can play with improved hashing speeds next week.

Thanks again for the amazing test data!

tdreyno commented 5 years ago

Merged depth PR. Finishing up deps format update. ☝️

tdreyno commented 5 years ago

Once the above merges, I'd love to re-time the following command on your system. Then we can use that as the benchmark for the Hashing changes.

middleman build --track-dependencies --data-collection-depth=1 --only-changed

tdreyno commented 5 years ago

Added an env flag MIDDLEMAN_SHELL_OUT_TO_GIT_HASH=true to shell out to git for file SHAs. Worth experimenting with for timings.

smcgivern commented 5 years ago

On our CI, I ran this command: time bundle exec middleman build --data-collection-depth=1 --only-changed --track-dependencies --verbose --no-parallel

--verbose and --no-parallel were there for debugging, so I can remove them if you think it will be a material difference. Unfortunately, I just got this:

# ...
== Rebuilding resource list
== Running manipulator: sitemap_ondisk (0)
== Running manipulator: sitemap_endpoint (0)
== Running manipulator: sitemap_proxies (0)
== Running manipulator: sitemap_redirects (0)
== Running manipulator: sitemap_ignore (0)
== Running manipulator: sitemap_import (1)
== Running manipulator: routing (10)
== Running manipulator: front_matter (20)
== Running manipulator: blog_blog1_articles (50)
== Running manipulator: blog_blog1_categories (50)
== Running manipulator: minify_css (50)
== Running manipulator: minify_javascript (50)
== Running manipulator: sitemap_move_files (101)
== Running manipulator: collections (110)
== Running manipulator: routing (130)
bundler: failed to load command: middleman (/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman)

Which is puzzling. This is using middleman at 55b1e629af8254d45fd8216257eaa4d26dfa0e16. I see something similar locally, too:

NoMethodError: undefined method `each' for nil:NilClass
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/set.rb:81:in `initialize'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/immutable.rb:14:in `new'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/hamster-3.0.0/lib/hamster/immutable.rb:14:in `new'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-core/lib/middleman-core/dependencies/graph.rb:126:in `invalidated'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_reference.rb:43:in `send_to'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/call_with.rb:79:in `call_with'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-core/lib/middleman-core/builder.rb:79:in `run!'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_reference.rb:43:in `send_to'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/call_with.rb:79:in `call_with'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/contracts-0.16.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-cli/lib/middleman-cli/build.rb:115:in `block in build'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/activesupport-5.2.2/lib/active_support/notifications.rb:170:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-cli/lib/middleman-cli/build.rb:114:in `build'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `block in invoke_all'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `each'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `map'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `invoke_all'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/group.rb:232:in `dispatch'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:115:in `invoke'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor.rb:40:in `block in register'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bundler/gems/middleman-55b1e629af82/middleman-cli/bin/middleman:64:in `<top (required)>'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman:23:in `load'
  /Users/seanmcgivern/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/bin/middleman:23:in `<top (required)>'

Do I need an existing deps.yml for --only-changed to work?

tdreyno commented 5 years ago

It does look like that is the case. Fixing now.
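The `undefined method 'each' for nil:NilClass` failure above is the usual symptom of building a set from a lookup that returned nil. A minimal sketch of the kind of guard that avoids it, assuming a deps.yml that maps a file to a list of dependencies (the real format may differ) and using the hypothetical helpers `load_dependency_map` and `dependents_of`:

```ruby
require 'yaml'
require 'set'

# Hypothetical loader: returns an empty mapping when no deps file exists yet,
# so a first run with --only-changed has something to iterate over.
def load_dependency_map(path)
  return {} unless File.exist?(path)

  YAML.safe_load(File.read(path)) || {}
end

# Hypothetical lookup: an unknown file yields an empty set rather than nil.
def dependents_of(map, file)
  Set.new(map.fetch(file, []))
end
```

With this shape, a missing deps.yml simply means "nothing is known yet" instead of raising.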

tdreyno commented 5 years ago

Okay, that should fix the explosion. But, if you can omit --only-changed for now, I'm more curious about initial build time and the resulting deps.yml.

time bundle exec middleman build --no-parallel --verbose --track-dependencies

And if that succeeds:

MIDDLEMAN_SHELL_OUT_TO_GIT_HASH=true time bundle exec middleman build --no-parallel --verbose --track-dependencies

smcgivern commented 5 years ago

@tdreyno here are timings and files from our CI (I added --data-collection-depth=1 back):

  1. Without env var, 16m12.168s, deps.yml
  2. With MIDDLEMAN_SHELL_OUT_TO_GIT_HASH, 14m49.467s, deps.yml
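For a quick comparison of the two timings above, the arithmetic can be sketched as follows. `duration_in_seconds` is a throwaway helper written for this example, not part of any tool mentioned here.

```ruby
# Converts a `time`-style duration such as "16m12.168s" into seconds.
def duration_in_seconds(text)
  match = text.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "unrecognised duration: #{text}" unless match

  match[1].to_i * 60 + match[2].to_f
end

baseline = duration_in_seconds('16m12.168s') # run 1, without the env var
git_hash = duration_in_seconds('14m49.467s') # run 2, with the env var
saved = baseline - git_hash                  # about 82.7 seconds
percent = saved / baseline * 100             # about 8.5 percent
```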

However, there was an issue with the second run:

fatal: Cannot open '/builds/gitlab-com/www-gitlab-com/source/posts/2015-09-17-gitlab-announces-M-series-a-funding-from-khosla-ventures.html.md': No such file or directory
sh: 1: Syntax error: Unterminated quoted string
sh: 1: Syntax error: Unterminated quoted string
Project built successfully.

This file is actually source/posts/2015-09-17-gitlab-announces-$4M-series-a-funding-from-khosla-ventures.html.md. Having a dollar sign in a filename is asking for trouble, but I guess you'll find this stuff in the real world. I will try without the env var and with parallel, because I suspect that the gains from the parallel output are bigger and the git hashing probably isn't so valuable, at least for us.

smcgivern commented 5 years ago

Oh, parallel failed again.

Stacktrace

/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:462:in `dump': singleton can't be dumped (TypeError)
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:462:in `process_incoming_jobs'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:440:in `block in worker'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:431:in `fork'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:431:in `worker'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:422:in `block in create_workers'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:421:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:421:in `each_with_index'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:421:in `create_workers'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:361:in `work_in_processes'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:267:in `map'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/builder.rb:205:in `output_resources'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/builder.rb:175:in `output_files'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/builder.rb:111:in `block in run!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/activesupport-5.2.2/lib/active_support/notifications.rb:170:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/builder.rb:110:in `run!'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-cli/lib/middleman-cli/build.rb:115:in `block in build'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/activesupport-5.2.2/lib/active_support/notifications.rb:170:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-core/lib/middleman-core/util.rb:21:in `instrument'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-cli/lib/middleman-cli/build.rb:114:in `build'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `block in invoke_all'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `each'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `map'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:133:in `invoke_all'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/group.rb:232:in `dispatch'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:115:in `invoke'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:40:in `block in register'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bundler/gems/middleman-593e8df7f6a7/middleman-cli/bin/middleman:64:in `'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman:23:in `load'
    from /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman:23:in `'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli/exec.rb:74:in `load'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli/exec.rb:74:in `kernel_load'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli/exec.rb:28:in `run'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli.rb:463:in `exec'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli.rb:27:in `dispatch'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/cli.rb:18:in `start'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/exe/bundle:30:in `block in '
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/lib/bundler/friendly_errors.rb:124:in `with_friendly_errors'
    from /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.17.3/exe/bundle:22:in `'
    from /usr/local/bin/bundle:23:in `load'
    from /usr/local/bin/bundle:23:in `'
WARNING: V8 isolate was forked, it can not be disposed and memory will not be reclaimed till the Ruby process exits.
bundler: failed to load command: middleman (/builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/bin/middleman)
Parallel::DeadWorker: Parallel::DeadWorker
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:74:in `rescue in work'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:71:in `work'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:385:in `block (4 levels) in work_in_processes'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:498:in `with_instrumentation'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:384:in `block (3 levels) in work_in_processes'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:372:in `loop'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:372:in `block (2 levels) in work_in_processes'
  /builds/gitlab-com/www-gitlab-com/vendor/ruby/2.4.0/gems/parallel-1.13.0/lib/parallel.rb:206:in `block (2 levels) in in_threads'

tdreyno commented 5 years ago

Cool, I'll loop back to parallel once some of this nitty gritty is working well.
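As background on the `singleton can't be dumped` error in that trace: Ruby's `Marshal.dump` raises this TypeError for any object that has singleton methods, and process-based parallelism has to serialize whatever crosses the process boundary. The snippet below is a generic reproduction with a made-up `resource` object, not the object that actually failed in the build; it also shows that plain data round-trips without trouble.

```ruby
# An object with a singleton method cannot be marshalled.
resource = Object.new
def resource.render
  'html'
end

begin
  Marshal.dump(resource)
rescue TypeError => e
  error_message = e.message # "singleton can't be dumped"
end

# Plain data (hashes, strings, symbols) crosses the boundary fine, so one
# possible workaround is to send only data between processes.
payload = { path: 'index.html', body: resource.render }
round_tripped = Marshal.load(Marshal.dump(payload))
```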

For context, what's a normal clean build time with Middleman 4.x?

tdreyno commented 5 years ago

The YAML is down to 2.5 MB, that's nice. I think --data-collection-depth=0 might even work better, but we'll have to see. 0 would make it only rebuild if the originating data YAML file changed.
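One way to picture the depth setting is as truncating the key path that gets recorded for each data access. This is only an illustration of the idea, assuming depth counts nested keys below the file; `tracked_dependency` and the file name `data/events.yml` are hypothetical and not taken from Middleman.

```ruby
# Hypothetical helper, not Middleman's API: reduce one data access to the
# granularity tracked at a given depth.
#   depth 0        -> only the originating file
#   depth 1        -> the file plus the top-level key
#   depth Infinity -> the full key path
def tracked_dependency(file, key_path, depth)
  keys = depth.to_f.infinite? ? key_path : key_path.first(depth)
  [file, *keys]
end

tracked_dependency('data/events.yml', ['2019', 'talks', 0], 0)
# => ["data/events.yml"]
tracked_dependency('data/events.yml', ['2019', 'talks', 0], 1)
# => ["data/events.yml", "2019"]
```

A shallower depth records fewer distinct entries, which would fit the smaller deps.yml, at the cost of rebuilding more pages when any part of the file changes.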

tdreyno commented 5 years ago

With MIDDLEMAN_SHELL_OUT_TO_GIT_HASH, I am definitely shelling out very lazily right now, so it totally makes sense that some escaping will be necessary for special shell characters like $.
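For reference, Ruby's standard library offers two common ways to keep a filename like the one above intact when calling an external command. This is a generic sketch, not the fix that went into Middleman; `echo_argument` is a hypothetical helper that spawns Ruby itself, so the example does not depend on git being installed.

```ruby
require 'shellwords'
require 'open3'
require 'rbconfig'

FILENAME = 'source/posts/2015-09-17-gitlab-announces-$4M-series-a-funding-from-khosla-ventures.html.md'

# Option 1: escape the value before interpolating it into a command string.
ESCAPED = Shellwords.escape(FILENAME) # the "$" becomes "\$"

# Option 2: pass arguments as a list so that no shell parses them at all.
def echo_argument(arg)
  out, _status = Open3.capture2(RbConfig.ruby, '-e', 'print ARGV[0]', arg)
  out
end
```

The list form is usually the more robust choice, since it also covers quotes, spaces, and other characters without any escaping step.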

Saving ~1.5 minutes ain't nothing, but it's not the huge boon I was hoping for.

smcgivern commented 5 years ago

For context, what's a normal clean build time with Middleman 4.x?

I disabled parallelisation (otherwise it's not a fair comparison) on our master branch and I got 11m37.131s. So it's pretty close!

tdreyno commented 5 years ago

Great. I'm going to fix up parallelization next (and escape the git hash stuff).

Maybe on Tuesday, if you've got the time, we could find a representative change from your repo: either "updated a handful of templates", "wrote a blog post", or "added an item to the events data". Then we can see how those do with --only-changed on.

smcgivern commented 5 years ago

I had actually been trying that before; we just never got this close! I will give that a go next week, yeah. (Monday's not a holiday where I am anyway 🙂)

smcgivern commented 5 years ago

Just to clarify, can I run the same command twice, or do I have to omit --missing-and-changed the first time round?

tdreyno commented 5 years ago

I haven't checked with --missing-and-changed yet. --only-changed should be safest for now. Honestly, I should merge the two... the original idea was for this to work without an existing dist to compare against. If dist is cached (CI), then --missing-and-changed makes sense to maintain the cache, while --only-changed could work from a clean checkout and would generate more of a delta/diff to rsync somewhere.

But, back to the original question, it should be safe to always include the X-changed flag, even if deps.yml doesn't exist yet. That said, I've been using either/or in my local tests, so that might not be 100% yet.
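To make the distinction between the two flags concrete, here is one possible reading of it as a selection rule. This reflects only the description above, not Middleman's code; `outputs_to_build` and its arguments are invented for the example.

```ruby
require 'set'

# Hypothetical selection rule for the two flags:
#   :only_changed        -> outputs whose dependencies changed
#   :missing_and_changed -> the above, plus outputs absent from the build dir
#   anything else        -> a full build
def outputs_to_build(all_outputs, changed, existing, mode)
  case mode
  when :only_changed
    all_outputs.select { |o| changed.include?(o) }
  when :missing_and_changed
    all_outputs.select { |o| changed.include?(o) || !existing.include?(o) }
  else
    all_outputs
  end
end

all_outputs = ['index.html', 'blog.html', 'about.html']
changed = Set['blog.html']
existing = Set['index.html', 'blog.html'] # about.html is not in the cached dist

outputs_to_build(all_outputs, changed, existing, :only_changed)
# => ["blog.html"]
outputs_to_build(all_outputs, changed, existing, :missing_and_changed)
# => ["blog.html", "about.html"]
```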