datacite / lupo

DataCite REST API
https://api.datacite.org
MIT License

Identify Tests taking the most time #1241

Closed jrhoads closed 1 week ago

jrhoads commented 3 weeks ago

Looking at some of our most recent GitHub Actions runs, groups 0 and 1 of the parallel spec groups tend to run much longer than the rest: about 17 and 13 minutes respectively, compared with an average of around 6 minutes for the other 8 spec groups.

So let's focus on group 0 for now and potentially get back to group 1 later.

We can run a profile on group 0:

bundle exec parallel_test spec/ -n 10 --only-group 0 --type rspec -o "--fail-fast --profile 10"

jrhoads commented 2 weeks ago

Running a profile on group 0 in a local dev environment shows the following:

Top 10 slowest example groups:
  ProviderPrefixesController
    9.16 seconds average (119.13 seconds / 13 examples) ./spec/requests/provider_prefixes_spec.rb:5
  Providers
    7.76 seconds average (85.31 seconds / 11 examples) ./spec/concerns/countable_spec.rb:5
  EventImportByIdJob
    4.93 seconds average (4.93 seconds / 1 example) ./spec/jobs/event_import_by_id_job_spec.rb:5
  DataciteDoisController
    4.84 seconds average (4.84 seconds / 1 example) ./spec/requests/datacite_dois_gzip_spec.rb:5
  User
    4.84 seconds average (1524.22 seconds / 315 examples) ./spec/models/ability_spec.rb:6
  UrlJob
    4.53 seconds average (4.53 seconds / 1 example) ./spec/jobs/url_job_spec.rb:5
  User
    4.36 seconds average (148.32 seconds / 34 examples) ./spec/concerns/authenticable_spec.rb:5
  DataciteDoi
    3.31 seconds average (9.94 seconds / 3 examples) ./spec/models/datacite_doi_spec.rb:5
  WorksController
    3.07 seconds average (18.41 seconds / 6 examples) ./spec/requests/works_spec.rb:5
  PrefixType
    2.65 seconds average (5.3 seconds / 2 examples) ./spec/graphql/types/prefix_type_spec.rb:5

Finished in 35 minutes 13 seconds (files took 15.34 seconds to load)

While the highest average time belongs to ProviderPrefixesController, the largest total time, at 1524.22 seconds, belongs to ability_spec.rb.

We should focus on the ability_spec to see if there are any places where we can save time in this spec.
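Since the profile is sorted by average time per example, it can help to re-rank the groups by total time instead. The following is a minimal sketch of that, not part of the lupo test suite: `PROFILE_LINE` and `rank_by_total` are hypothetical names, and the sample report is just four lines copied from the profile output.

```ruby
# Re-rank the groups in an RSpec --profile report by total time
# instead of by per-example average.
# PROFILE_LINE and rank_by_total are illustrative names, not part of lupo.
PROFILE_LINE = %r{
  (?<average>\d+(?:\.\d+)?)\s+seconds\s+average\s+
  \((?<total>\d+(?:\.\d+)?)\s+seconds\s+/\s+(?<count>\d+)\s+examples?\)\s+
  (?<location>\S+)
}x

def rank_by_total(report)
  rows = report.scan(PROFILE_LINE).map do |average, total, count, location|
    { location: location, average: average.to_f, total: total.to_f, examples: count.to_i }
  end
  # Largest total time first.
  rows.sort_by { |row| -row[:total] }
end

# Four entries copied from the profile output.
REPORT = <<~TEXT
  ProviderPrefixesController 9.16 seconds average (119.13 seconds / 13 examples) ./spec/requests/provider_prefixes_spec.rb:5
  Providers 7.76 seconds average (85.31 seconds / 11 examples) ./spec/concerns/countable_spec.rb:5
  User 4.84 seconds average (1524.22 seconds / 315 examples) ./spec/models/ability_spec.rb:6
  User 4.36 seconds average (148.32 seconds / 34 examples) ./spec/concerns/authenticable_spec.rb:5
TEXT

RANKED = rank_by_total(REPORT)
RANKED.each { |row| puts format("%8.2fs  %s", row[:total], row[:location]) }
```

Ranked this way, ability_spec.rb comes first by a wide margin, followed by authenticable_spec.rb, which the average-based ordering places below ProviderPrefixesController.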

jrhoads commented 2 weeks ago

Group 1 from above only runs one spec file, spec/requests/datacite_dois_spec.rb, at around 14 minutes per run (although the group numbering can change as the length of files changes).

Currently the grouping of specs is dependent on the total length of each file. spec/requests/datacite_dois_spec.rb has ~5k lines of code in it. It also has several sleep statements.
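To illustrate why one very long file ends up in a group by itself, here is a simplified sketch of size-based grouping. It is an assumption-laden illustration, not the actual parallel_tests implementation: `group_by_size` is a hypothetical helper, and every file name and line count other than the ~5k figure for datacite_dois_spec.rb is made up.

```ruby
# Simplified illustration of grouping spec files by size.
# NOT the real parallel_tests algorithm: group_by_size is a hypothetical helper
# that hands each file, largest first, to the group that is currently smallest.
def group_by_size(sizes, group_count)
  groups = Array.new(group_count) { { files: [], size: 0 } }
  sizes.sort_by { |_file, size| -size }.each do |file, size|
    smallest = groups.min_by { |group| group[:size] }
    smallest[:files] << file
    smallest[:size] += size
  end
  groups
end

# Line counts are hypothetical, except the ~5k lines for datacite_dois_spec.rb.
SIZES = {
  "spec/requests/datacite_dois_spec.rb" => 5000,
  "spec/a_spec.rb" => 1200, # hypothetical file
  "spec/b_spec.rb" => 1100, # hypothetical file
  "spec/c_spec.rb" => 900,  # hypothetical file
  "spec/d_spec.rb" => 800,  # hypothetical file
  "spec/e_spec.rb" => 700   # hypothetical file
}.freeze

GROUPS = group_by_size(SIZES, 3)
GROUPS.each_with_index do |group, index|
  puts "group #{index}: #{group[:size]} lines, #{group[:files].join(', ')}"
end
```

In this sketch the 5k-line file fills a group on its own, and nothing in a size-based split accounts for time spent in sleep statements, so that group's wall-clock time can still exceed the others.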

The set of work to focus on is spec/requests/datacite_dois_spec.rb

jrhoads commented 1 week ago

Closing this for now. After working on these two groups of tests and increasing the number of parallel runners, the total time for the tests is 7 minutes and change.