Closed jclark1913 closed 1 year ago
Indeed, this is expected for a list of async tasks, where whichever task reaches a certain point first dictates the first/last values; there's no ordering guarantee.
One option is to keep the data gathering in async tasks, as it is now, but then collate the result from the list of task results. Another idea, if this is only an issue for the timestamps, is an extra comparison of the timestamp using min/max or </>.
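A rough sketch of the first option, in case it helps. The names fetch_sighting, collate and gather_and_collate are hypothetical and not taken from the tool's code; the sketch assumes each task can return a (code, timestamp) pair instead of writing to a shared dictionary. The first/last values are then computed in one place after asyncio.gather() has returned, so completion order no longer matters.

```python
import asyncio
from datetime import datetime


async def fetch_sighting(code, seen_at):
    """Hypothetical stand-in for one async lookup; returns data, mutates nothing."""
    await asyncio.sleep(0)  # placeholder for the real network call
    return code, seen_at


def collate(sightings):
    """Build first/last seen per code from a finished list of (code, timestamp)."""
    results = {}
    for code, seen_at in sightings:
        entry = results.setdefault(
            code, {"first_seen": seen_at, "last_seen": seen_at}
        )
        entry["first_seen"] = min(entry["first_seen"], seen_at)
        entry["last_seen"] = max(entry["last_seen"], seen_at)
    return results


async def gather_and_collate(inputs):
    # Tasks only return values; nothing shared is written while they run.
    sightings = await asyncio.gather(
        *(fetch_sighting(code, seen_at) for code, seen_at in inputs)
    )
    return collate(sightings)
```

Called as asyncio.run(gather_and_collate(inputs)) with a list of (code, datetime) pairs, this should give the same result no matter how the inputs are ordered.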
Overview
On occasion, the tool returns incorrect start/end dates for codes. An example:
First seen falls after last seen in some places, and there isn't really a logical progression from code to code.
Cause and possible solution
I think the main culprit here is that the results dictionary is being updated asynchronously. Since we're calling asyncio.gather() on a collection of tasks, the work doesn't proceed in order from timestamp to timestamp. Instead, a bunch of async calls are in flight at once, and whichever finishes first sets "first_seen".

This could be solved by adding some extra logic that assumes the dates will be processed out of order and updates first/last seen based on the value of the date being processed.
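Something like the following is what I have in mind. It is only a sketch: record_sighting, process_snapshot and scan are made-up names, the random sleep stands in for real request latency, and the shape of the results dictionary is an assumption. The point is that every update compares the incoming date against what is already stored, so it doesn't matter which task finishes first.

```python
import asyncio
import random
from datetime import datetime


def record_sighting(results, code, seen_at):
    """Order-independent update: widen the stored range, never overwrite it."""
    entry = results.get(code)
    if entry is None:
        results[code] = {"first_seen": seen_at, "last_seen": seen_at}
        return
    if seen_at < entry["first_seen"]:
        entry["first_seen"] = seen_at
    if seen_at > entry["last_seen"]:
        entry["last_seen"] = seen_at


async def process_snapshot(results, code, seen_at):
    # Simulated latency, so tasks finish in an arbitrary order.
    await asyncio.sleep(random.uniform(0, 0.01))
    # No await between the read and the write inside record_sighting,
    # so the update cannot interleave with another task on the same loop.
    record_sighting(results, code, seen_at)


async def scan(sightings):
    results = {}
    await asyncio.gather(
        *(process_snapshot(results, code, seen_at) for code, seen_at in sightings)
    )
    return results
```

With this in place, first_seen can never end up later than last_seen for a code, regardless of completion order.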