abyrd opened 3 years ago
Note that some of the slowness in these tests is due to building histograms at every destination, even though the test only looks at one of them. Removing that behavior would probably speed them up significantly, making it more reasonable to spend the saved time on additional MC draws.
Recent tests, including with the frequency-heavy network of Sao Paulo, prompted me to think about this again. Letting users toggle on deterministic seeding would be straightforward to implement (e.g. using a similar approach to the one in the multi-criteria router, at https://github.com/conveyal/r5/blob/v6.9/src/main/java/com/conveyal/r5/profile/McRaptorSuboptimalPathProfileRouter.java#L120-L123) and would help users resolve a common headache when they are doing scenario comparisons. We could still recommend networks with frequency-based routes be analyzed with fully randomized schedules first, to get a sense of the noise/uncertainty.
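For reference, a minimal sketch of what the toggle could look like (the flag and class names here are hypothetical, not actual R5 parameters; the idea is simply a fixed seed behind a user-controlled switch, analogous to the linked McRaptor code):

```java
import java.util.Random;

public class SeedToggleSketch {
    // Hypothetical constant; any fixed value works as long as both
    // scenarios in a comparison use the same one.
    static final long FIXED_SEED = 42L;

    // When deterministic seeding is toggled on, every run (e.g. both sides
    // of a scenario comparison) sees the identical sequence of draws.
    static Random makeRandom(boolean deterministicSeeding) {
        return deterministicSeeding ? new Random(FIXED_SEED) : new Random();
    }

    public static void main(String[] args) {
        Random a = makeRandom(true);
        Random b = makeRandom(true);
        // Two deterministically seeded generators yield identical sequences.
        boolean identical = a.nextInt() == b.nextInt() && a.nextInt() == b.nextInt();
        System.out.println("deterministic sequences identical: " + identical);
    }
}
```

This keeps fully randomized schedules as the default, so the recommendation to first analyze frequency-heavy networks with random schedules is unaffected.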
We discussed this again recently. Results are expected to converge on stable values with an adequate number of MC draws. In the context of these stochastic methods, there does not seem to be a legitimate use for fixed seeds. Any use would amount to an illusion of precision and could lead to inadvertent cherry-picking of results.
If expectations for stable results are not met, there are two main situations to address:
For tests, the solution is probably to increase the number of MC draws until they pass reliably. Test results would remain nondeterministic by nature, but the probability of failure can be lowered until it essentially never happens.
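To put a rough number on "until they pass reliably": by the central limit theorem, the error of a Monte Carlo mean shrinks with the square root of the number of draws, so the draw count needed for a given tolerance can be estimated up front. A sketch (the sigma and tolerance values below are made-up illustrations, not measured from our tests):

```java
public class DrawBudget {
    // Rough CLT-based estimate: number of MC draws needed for the sample
    // mean's error to stay within `tol`, with confidence given by z-score z.
    // sigma is the standard deviation of a single draw's result.
    static long drawsNeeded(double sigma, double tol, double z) {
        return (long) Math.ceil(Math.pow(z * sigma / tol, 2));
    }

    public static void main(String[] args) {
        // Illustrative only: sigma = 300 s of travel-time spread per draw,
        // tolerance = 10 s, z = 3 (roughly 99.7% of runs within tolerance).
        System.out.println(drawsNeeded(300, 10, 3));
    }
}
```

With these illustrative inputs the estimate is 8100 draws; quadrupling the draw count halves the tolerance, which is why failures can be made arbitrarily rare without ever being impossible.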
For regular use we should provide guidance on exact-times, phasing, and number of MC draws, and explain clearly in documentation how and why these stabilize results.
If we want to allow increasing the number of MC draws, we should also test and adjust the socket timeout settings (referenced at https://github.com/conveyal/r5/blob/v6.9/src/main/java/com/conveyal/analysis/controllers/BrokerController.java).
Some of the Simpson Desert tests occasionally fail in GitHub Actions test runs. This is probably because they test the closeness of our Monte Carlo results to theoretical results, and there's always some probability that the MC results will be way off. For reproducible testing we could seed all our random number generators, but arguably that reduces thoroughness, and in any case small changes to routing could still change the order in which numbers are produced and cause the tests to fail again. Maybe we should just use really high numbers of MC draws on these tests.
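The order-sensitivity point above is easy to demonstrate: even with an identical fixed seed, a routing change that consumes one extra draw shifts every subsequent value, so seeded tests would break anyway. A self-contained illustration:

```java
import java.util.Random;

public class OrderSensitivity {
    public static void main(String[] args) {
        // Two generators with the same fixed seed.
        Random a = new Random(123L);
        Random b = new Random(123L);

        // Simulate a small routing change that consumes one extra draw
        // somewhere earlier in the computation.
        a.nextInt();

        // From here on the two sequences are out of step: the same seed
        // no longer yields the same results once the call order changes.
        boolean diverged = a.nextInt() != b.nextInt();
        System.out.println("sequences diverged after an extra draw: " + diverged);
    }
}
```

This is why seeding alone can't make these tests durably reproducible, and why raising the draw count is the more robust option.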