We split WPT into a number of segments to support parallelism and to limit the impact of fatal errors (it's much easier to recover a failed build for 1/20th of WPT than a failed build for all of WPT). To date, we have defined these segments using only the --this-chunk and --total-chunks arguments to the WPT CLI. As a result, every segment is likely to include some of every test "type" supported by WPT (currently, that's "reftest", "testharness", and "wdspec").
This is slightly inefficient because there is a delay when switching between test types. The delay is short (scarcely longer than the time required to restart the browser), and infrequent (tests are run in groups according to their type), so it's not a concern in the scheme of things.
However, this approach does intensify the impact of certain kinds of infrastructure errors. We occasionally experience errors that are specific to a certain test type. When tests of every type are dispersed across all segments, these errors can interfere with every segment. I don't think we've seen that yet, but for intermittent errors, this dispersion increases the likelihood of partial failure for a given collection attempt.
The WPT CLI also offers an argument for filtering by test type (it's named --test-types, as fate would have it). This project should use that when defining segments so that a type-specific error can only affect the segments running that type. (Good news: WPT's TaskCluster integration already takes this approach.)
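To make the proposal concrete, here is a rough sketch of how segments could be defined per test type. The three flags are the ones named above; everything else is an assumption for illustration: the function name, the chunk counts (picked only so they add up to 20), the 1-based numbering of --this-chunk, and the exact way values are passed to each flag, which should be checked against the WPT CLI's help output.

```python
from typing import Dict, List

# Hypothetical chunk counts per test type. The real numbers would need to be
# tuned to how long each type takes to run; these simply add up to 20.
CHUNKS_PER_TYPE: Dict[str, int] = {
    "testharness": 16,
    "reftest": 3,
    "wdspec": 1,
}


def segment_args(chunks_per_type: Dict[str, int]) -> List[List[str]]:
    """Return one list of WPT CLI arguments per segment.

    Each segment is restricted to a single test type, so an error that is
    specific to one type can only affect the segments for that type.
    """
    segments: List[List[str]] = []
    for test_type, total in chunks_per_type.items():
        if total < 1:
            raise ValueError(f"need at least one chunk for {test_type!r}")
        # Assumption: --this-chunk is 1-based.
        for this_chunk in range(1, total + 1):
            segments.append([
                "--test-types", test_type,
                "--total-chunks", str(total),
                "--this-chunk", str(this_chunk),
            ])
    return segments


if __name__ == "__main__":
    for args in segment_args(CHUNKS_PER_TYPE):
        print(" ".join(args))
```

With a layout like this, a wdspec-specific failure would cost us one segment rather than potentially touching all twenty.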