solidusio / solidus

🛒 Solidus, the open-source eCommerce framework for industry trailblazers.
https://solidus.io
Other
5.03k stars 1.29k forks source link

Flaky Feature Spec detected #4110

Closed kennyadsl closed 3 years ago

kennyadsl commented 3 years ago

We currently have a flaky feature spec related to products. It can be inspected here:

https://app.circleci.com/pipelines/github/solidusio/solidus/2519/workflows/b06934e2-9055-44ba-960b-37ed4345939b/jobs/25240

It is happening quite often, see all the latest failures in this list for more occurrences. At a first look, it doesn't seem related to the database used.

gsmendoza commented 3 years ago

@kennyadsl I found that spec/features/products_spec.rb:200 fails when the homepage_products div in frontend/app/views/spree/home/index.html.erb is cached.

I added debug statements to the page in https://github.com/nebulab/solidus/commit/f2a67c1205b82ac101ab25ac72cfacedec091c4e.

Here is the debug output when the spec succeeds:

frontend 09:02:53 $ bundle exec rspec spec/features/products_spec.rb:200 --seed=13987
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[200]}}

Randomized with seed 13987
cache_key_for_products: en/USD/spree/products/all--20210624010302-9

Ruby on Rails Baseball Jersey details:
Time.current: 2021-06-24 01:03:02 UTC
cache_key: [:en, #<Spree::Variant::PricingOptions:0x0000563b81955b60 @desired_attributes={:currency=>"USD", :country_iso=>nil}>, #<Spree::Product id: 5, name: "Ruby on Rails Baseball Jersey", description: "As seen on TV!", available_on: "2020-06-24 01:03:02.739883000 +0000", deleted_at: nil, slug: "ruby-on-rails-baseball-jersey", meta_description: nil, meta_keywords: nil, tax_category_id: 1, shipping_category_id: 1, created_at: "2021-06-24 01:03:02.748137000 +0000", updated_at: "2021-06-24 01:03:02.755503000 +0000", promotionable: true, meta_title: nil, discontinue_on: nil>]
product.variants.first.images.first:
product.gallery.images.first:

cache_key_for_products: en/USD/spree/products/all--20210624010303-9

Ruby on Rails Baseball Jersey details:
Time.current: 2021-06-24 01:03:03 UTC
cache_key: [:en, #<Spree::Variant::PricingOptions:0x0000563b8208f6d8 @desired_attributes={:currency=>"USD", :country_iso=>nil}>, #<Spree::Product id: 5, name: "Ruby on Rails Baseball Jersey", description: "As seen on TV!", available_on: "2020-06-24 01:03:02.739883000 +0000", deleted_at: nil, slug: "ruby-on-rails-baseball-jersey", meta_description: nil, meta_keywords: nil, tax_category_id: 1, shipping_category_id: 1, created_at: "2021-06-24 01:03:02.748137000 +0000", updated_at: "2021-06-24 01:03:03.216486000 +0000", promotionable: true, meta_title: nil, discontinue_on: nil>]
product.variants.first.images.first: #<Spree::Image:0x0000563b82181ed8>
product.gallery.images.first: #<Spree::Image:0x0000563b82132478>

.

Finished in 1.48 seconds (files took 2.16 seconds to load)
1 example, 0 failures

Randomized with seed 13987

Notice that cache_key_for_products changed in the two visits to spree.root_path:

Here is the debug output when the spec fails:

frontend 08:59:05 $ bundle exec rspec spec/features/products_spec.rb:200
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[200]}}

Randomized with seed 13987
cache_key_for_products: en/USD/spree/products/all--20210624010128-9

Ruby on Rails Baseball Jersey details:
Time.current: 2021-06-24 01:01:28 UTC
cache_key: [:en, #<Spree::Variant::PricingOptions:0x0000557ea70f28c0 @desired_attributes={:currency=>"USD", :country_iso=>nil}>, #<Spree::Product id: 5, name: "Ruby on Rails Baseball Jersey", description: "As seen on TV!", available_on: "2020-06-24 01:01:28.268546000 +0000", deleted_at: nil, slug: "ruby-on-rails-baseball-jersey", meta_description: nil, meta_keywords: nil, tax_category_id: 1, shipping_category_id: 1, created_at: "2021-06-24 01:01:28.276694000 +0000", updated_at: "2021-06-24 01:01:28.283932000 +0000", promotionable: true, meta_title: nil, discontinue_on: nil>]
product.variants.first.images.first:
product.gallery.images.first:

cache_key_for_products: en/USD/spree/products/all--20210624010128-9

F  HTML screenshot: /home/gsmendoza/repos/solidus/frontend/spec/dummy/tmp/capybara/screenshot_2021-06-24-09-01-28.768.html

Failures:

  1) Visiting Products a product with variants, images only for the variants should not display no image available
     Failure/Error: expect(page).to have_xpath("//img[contains(@src,'blank')]")
       expected to find xpath "//img[contains(@src,'blank')]" but there were no matches
     # ./spec/features/products_spec.rb:200:in `block (3 levels) in <top (required)>'

Finished in 1.43 seconds (files took 2.13 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/features/products_spec.rb:198 # Visiting Products a product with variants, images only for the variants should not display no image available

Randomized with seed 13987

Notice that cache_key_for_products didn't change:

Let's discuss later how to best fix the issue.

kennyadsl commented 3 years ago

Nice catch, since specs are executed in random order, sometimes it won't find the image set because the system cached a version of the page without the image in a previous spec. This explains why the failure only happens sometimes.

To add some context, we have this cache: true option that enables caching in specs. This is defined here, maybe we used some Rails internals that changed in the latest versions and it's not working properly anymore?

gsmendoza commented 3 years ago

@kennyadsl I believe that the cause of the issue might be simpler, though it will require a long explanation :D

These are the relevant parts of the spec from start to finish. Please take note of my comments, marked "GSM":

describe "Visiting Products", type: :feature, inaccessible: true do
  # GSM: This creates the products.
  include_context "custom products"

  # ...

  before(:each) do
    # GSM: Products div is cached on first visit.
    visit spree.root_path
  end

  context "a product with variants, images only for the variants" do
    let(:product) { Spree::Product.find_by(name: "Ruby on Rails Baseball Jersey") }

    before do
      image = File.open(
        File.join(Spree::Core::Engine.root, "lib", "spree", "testing_support", "fixtures", "blank.jpg")
      )

      # GSM: When the variants are created, the product's `updated_at` timestamp is touched. This is because
      # of the `belongs_to :product, touch: true` association in Spree::Variant.
      v1 = product.variants.create!(price: 9.99)
      v2 = product.variants.create!(price: 10.99)

      v1.images.create!(attachment: image)
      v2.images.create!(attachment: image)
    end

    it "should not display no image available" do
      # GSM: On the second visit, the app fetches the cached product div fragment if the "Ruby on Rails
      # Baseball Jersey" product was touched on the exact _second_ it was created.
      visit spree.root_path

      expect(page).to have_xpath("//img[contains(@src,'blank')]")
    end
  end
end

To review, here are the relevant code from the app:

module Spree
  class Variant < Spree::Base
    # GSM: with touch set to true, the product is touched when the variant is updated.
    belongs_to :product, -> { with_discarded }, touch: true, class_name: 'Spree::Product', inverse_of: :variants_including_master, optional: false
  end
end
<div data-hook="homepage_products">
  <%
    puts '----------'
    puts "cache_key_for_products: #{cache_key_for_products}"
    puts "@products.map(&:updated_at).uniq: #{@products.map(&:updated_at).map(&:to_s).uniq}"
  %>

  <% # GSM: This is the cached fragment %>
  <% cache(cache_key_for_products) do %>
    <%= render partial: 'spree/shared/products', locals: { products: @products } %>
  <% end %>

</div>
module Spree
  module ProductsHelper
    def cache_key_for_products
      count = @products.count

      # GSM: the cache key uses the latest update time of the products. `max_updated_at` is rounded to the second e.g. 20210624063831.
      max_updated_at = (@products.maximum(:updated_at) || Date.today).to_s(:number)

      "#{I18n.locale}/#{current_pricing_options.cache_key}/spree/products/all-#{params[:page]}-#{max_updated_at}-#{count}"
    end
  end
end

The spec fails when the cached product div fragment is reused. This happens when the variants of the "Ruby on Rails Baseball Jersey" are created on the exact second when the product was created.

For example, this is the debug output when the spec fails (debug code here). Note that the product update timestamps did not change, which means the variants were created on the same second the product was created:

frontend 14:36:23 $ bundle exec rspec spec/features/products_spec.rb:200
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[200]}}

Randomized with seed 17091
----------
cache_key_for_products: en/USD/spree/products/all--20210624063831-9
@products.map(&:updated_at).uniq: ["2021-06-24 06:38:30 UTC", "2021-06-24 06:38:31 UTC"]
----------
cache_key_for_products: en/USD/spree/products/all--20210624063831-9
@products.map(&:updated_at).uniq: ["2021-06-24 06:38:30 UTC", "2021-06-24 06:38:31 UTC"]
F  HTML screenshot: /home/gsmendoza/repos/solidus/frontend/spec/dummy/tmp/capybara/screenshot_2021-06-24-14-38-31.463.html

Failures:

  1) Visiting Products a product with variants, images only for the variants should not display no image available
     Failure/Error: expect(page).to have_xpath("//img[contains(@src,'blank')]")
       expected to find xpath "//img[contains(@src,'blank')]" but there were no matches
     # ./spec/features/products_spec.rb:200:in `block (3 levels) in <top (required)>'

Finished in 1.4 seconds (files took 2.09 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/features/products_spec.rb:198 # Visiting Products a product with variants, images only for the variants should not display no image available

Randomized with seed 17091

The spec passes when when the cached product div fragment is busted. This happens when the variants of the "Ruby on Rails Baseball Jersey" are created one second after the product was created:

For example, this is the debug output when the spec succeed. Note that the product update timestamps changed after the variants were created:

rspec ./spec/features/products_spec.rb:198 # Visiting Products a product with variants, images only for the variants should not display no image available

Randomized with seed 56800

frontend 14:39:02 $ bundle exec rspec spec/features/products_spec.rb:200
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[200]}}

Randomized with seed 2657
----------
cache_key_for_products: en/USD/spree/products/all--20210624063906-9
@products.map(&:updated_at).uniq: ["2021-06-24 06:39:06 UTC"]
----------
cache_key_for_products: en/USD/spree/products/all--20210624063907-9
@products.map(&:updated_at).uniq: ["2021-06-24 06:39:06 UTC", "2021-06-24 06:39:07 UTC"]
.

Finished in 1.43 seconds (files took 2.09 seconds to load)
1 example, 0 failures

Randomized with seed 2657

Here are two ways we can resolve this issue:

  1. Remove the first visit to the root path in the spec. IMO, it's not obvious to find and unnecesarily complicates the feature specs.
  2. Increment the product's updated_at by one second after the variants are created in order to bust the cached fragment.

I intend to go with the first option because it hits two birds with one stone: aside from fixing the spec, it simplifies the whole products_spec suite. Let me know if there is a particular reason why that initial visit exists.

kennyadsl commented 3 years ago

Makes perfect sense and I agree 1. is the way to go. There's no need of visiting the product page twice. But I don't understand one thing: why is caching enabled in specs? As far as I got it should only be enabled when we use the cache: true helper, but maybe I'm wrong... Can you please also investigate this?

gsmendoza commented 3 years ago

@kennyadsl I think I found why the specs are now caching in test.

@waiting-for-dev In the commit's message, you wrote that action_controller.perform_caching was set to true in order to mimic production. Do we still want to maintain that? I intend to submit a PR reverting perform_caching back to false.

waiting-for-dev commented 3 years ago

Awesome catch @gsmendoza! I really enjoyed reading your criminal investigation report. Congrats! :smile:

@waiting-for-dev In the commit's message, you wrote that action_controller.perform_caching was set to true to mimic production. Do we still want to maintain that? I intend to submit a PR reverting perform_caching to false.

I think we should remove it. I added it because I thought that the original intention was to cache on the test environment, as the cache-store is set to :memory_store However, I'm not sure whether the issue will be fixed by setting perform_caching to false, is it possible that you need to set the :memory_store to :null_store instead?.

gsmendoza commented 3 years ago

Thanks @waiting-for-dev . Now I have this imagination of a real-life detective placing puts statements all over a crime scene :D

About your question in Slack, I'm confident that perform_caching toggles fragment caching. I ran the spec with the debug statements here (more puts!). With perform_caching enabled:

frontend 18:13:32 $ bundle exec rspec spec/features/products_spec.rb:201
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[201]}}

Randomized with seed 19247
ActionController::Base.perform_caching: true
<<<
cache_key_for_products: en/USD/spree/products/all--20210625101339-9
@products.map(&:updated_at).uniq: ["2021-06-25 10:13:39 UTC"]
>>>
<<<
rendering spree/shared/products...
>>>
<<<
cache_key_for_products: en/USD/spree/products/all--20210625101339-9
@products.map(&:updated_at).uniq: ["2021-06-25 10:13:39 UTC"]
>>>
F  HTML screenshot: /home/gsmendoza/repos/solidus/frontend/spec/dummy/tmp/capybara/screenshot_2021-06-25-18-13-39.674.html

Failures:

  1) Visiting Products a product with variants, images only for the variants should not display no image available
     Failure/Error: expect(page).to have_xpath("//img[contains(@src,'blank')]")
       expected to find xpath "//img[contains(@src,'blank')]" but there were no matches
     # ./spec/features/products_spec.rb:204:in `block (3 levels) in <top (required)>'

Notice that since cache_key_for_products was the same, the products partial was rendered only once.

With perform_caching turned off, products partial is always rendered even if cache_key_for_products doesn't change:

frontend 18:12:25 $ bundle exec rspec spec/features/products_spec.rb:201
Run options: include {:locations=>{"./spec/features/products_spec.rb"=>[201]}}

Randomized with seed 62202
ActionController::Base.perform_caching: false
<<<
cache_key_for_products: en/USD/spree/products/all--20210625101252-9
@products.map(&:updated_at).uniq: ["2021-06-25 10:12:51 UTC", "2021-06-25 10:12:52 UTC"]
>>>
<<<
rendering spree/shared/products...
>>>
<<<
cache_key_for_products: en/USD/spree/products/all--20210625101252-9
@products.map(&:updated_at).uniq: ["2021-06-25 10:12:51 UTC", "2021-06-25 10:12:52 UTC"]
>>>
<<<
rendering spree/shared/products...
>>>
.

Finally, based on the Rails caching guide, it seems that perform_caching applies to all basic Rails caching techniques:

image

gsmendoza commented 3 years ago

is it possible that you need to set the :memory_store to :null_store instead?

@waiting-for-dev I think we need to keep it at :memory_store since there are some specs that are tested with caching enabled. They're tagged with caching: true. Please see for example, frontend/spec/features/caching/products_spec.rb.

waiting-for-dev commented 3 years ago

Cool, let's go for it then :muscle: Thanks for checking!

kennyadsl commented 3 years ago

@gsmendoza 👍 nice catch! Let's go with re-introducing cache: false in the Dummy!