Since the tracking thread shares the same process as the main thread, inaccuracies are introduced, including the tracking thread hanging during critical computations in the main thread.
We need to replace the threading logic with corresponding process logic.
The process can still make use of an Event which, when set, can stop the while loop.
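As a rough illustration of the Event-controlled loop, here is a minimal sketch using the standard library's multiprocessing module. The names `_track` and `run_demo` and the sampling interval are hypothetical and only serve the example; the actual sampling logic is omitted.

```python
import multiprocessing as mp
import time


def _track(stop_event, interval: float) -> None:
    """Hypothetical tracking loop: runs until the stop event is set."""
    while not stop_event.is_set():
        # Resource usage would be sampled here.
        # wait() blocks for up to `interval` seconds and returns early
        # as soon as the event is set, so stopping is prompt.
        stop_event.wait(interval)


def run_demo() -> int:
    """Start the tracking process, stop it via the event, return its exit code.

    On platforms that use the "spawn" start method, call this from under
    an `if __name__ == "__main__":` guard.
    """
    stop_event = mp.Event()
    process = mp.Process(target=_track, args=(stop_event, 0.05))
    process.start()
    time.sleep(0.2)  # Stand-in for the main program's work.
    stop_event.set()  # Ends the while loop in the child process.
    process.join(timeout=5)
    exit_code = process.exitcode
    process.close()
    return exit_code
```

Because the loop runs in its own process, heavy computation in the main program no longer blocks sampling the way it can with a thread.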
To guard against an unrecoverable error, the process can store the resource usage in a pickle file (.<random-uuid>.pkl) on every iteration.
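One possible shape for the per-iteration save is sketched below. The helper names (`make_usage_path`, `save_usage`, `load_usage`) and the dictionary layout are assumptions made for illustration. Writing to a temporary file and then renaming it is an extra precaution not mentioned above; it keeps a reader from ever seeing a half-written pickle.

```python
import os
import pickle
import uuid


def make_usage_path(directory: str) -> str:
    """Build a hidden file name of the form .<random-uuid>.pkl."""
    return os.path.join(directory, f".{uuid.uuid4()}.pkl")


def save_usage(path: str, usage: dict) -> None:
    """Overwrite the usage file with the latest measurements."""
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "wb") as file:
        pickle.dump(usage, file)
    # os.replace swaps the file in atomically (an assumption of this
    # sketch, not a requirement stated in the description).
    os.replace(tmp_path, path)


def load_usage(path: str) -> dict:
    """Read back the most recently saved measurements."""
    with open(path, "rb") as file:
        return pickle.load(file)
```

The tracking process would call `save_usage` at the end of each loop iteration, and the main process would call `load_usage` once tracking has stopped.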
If the join fails (after all the retries), we need to forcibly terminate the process. In both cases, we close the process afterward.
We no longer need to kill the program if the join fails since we're reading from the pickle file. Instead, the warning can say that the process failed to stop normally and we had to forcibly kill it.
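The stop sequence described above could look roughly like the following sketch. The function name `stop_process`, the default retry count, and the timeout are hypothetical; only the join, terminate, close order and the warning come from the description.

```python
import logging
import multiprocessing as mp

logger = logging.getLogger(__name__)


def stop_process(process, stop_event, retries: int = 3, timeout: float = 1.0) -> bool:
    """Stop the tracking process; return True if it stopped normally."""
    stop_event.set()
    for _ in range(retries):
        process.join(timeout=timeout)
        if not process.is_alive():
            break
    stopped_normally = not process.is_alive()
    if not stopped_normally:
        logger.warning(
            "The tracking process failed to stop normally and was forcibly terminated."
        )
        process.terminate()
        # Wait for the terminated process to exit; close() raises
        # ValueError if the process is still running.
        process.join()
    # Release the resources held by the Process object in both cases.
    process.close()
    return stopped_normally
```

Returning a flag instead of raising keeps the main program running, which matches the point that a failed join no longer needs to kill the program.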
When tracking stops, we should compare the current time to the time that the pickle file was last saved. If they differ by more than _USAGE_FILE_TIME_DIFFERENCE = 10 seconds (class variable), we should log a warning informing the user that the file's time stamp differs from the tracking stop time by XXX seconds.
Then we remove the pickle file.
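A minimal sketch of the time stamp check and cleanup follows. It assumes the file's modification time is an acceptable stand-in for "the time that the file was saved", and it keeps `_USAGE_FILE_TIME_DIFFERENCE` at module level for brevity rather than as a class variable. The function name and the exact warning wording are hypothetical.

```python
import logging
import os
import time

logger = logging.getLogger(__name__)

_USAGE_FILE_TIME_DIFFERENCE = 10  # seconds


def check_and_remove_usage_file(path: str, stop_time=None) -> float:
    """Warn if the usage file is stale, delete it, and return the difference.

    The usage data should be read from the file before calling this.
    """
    if stop_time is None:
        stop_time = time.time()
    # Assumption: the modification time marks the last save.
    time_difference = abs(stop_time - os.path.getmtime(path))
    if time_difference > _USAGE_FILE_TIME_DIFFERENCE:
        logger.warning(
            "The usage file time stamp differs from the tracking stop time "
            "by %.1f seconds.",
            time_difference,
        )
    os.remove(path)
    return time_difference
```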
Add to the documentation that users may want to allocate an additional core to their job, dedicated to the tracking process, to get the most accurate tracking.
This will result in a release version of 3.0.0 and should coincide with the release procedure of the current release branch along with any needed changes to documentation.