Cloud-CV / EvalAI

Evaluating state of the art in AI
https://eval.ai

Fix static-code upload challenge - pod exits after submission container runs #4266

Open gchhablani opened 9 months ago

gchhablani commented 9 months ago

The pod currently crashes abruptly while the submission.csv file is being created inside the submission container. How long the file takes to appear varies between participant and host team submissions, so the monitoring container falls out of sync with the submission container.

As a temporary workaround, hosts manually add a sleep command after the submission file is created, giving the monitoring container enough time to pick up the file. This is hacky and not a sustainable solution.

Expected Behavior: The system should handle the creation of the submission.csv file more robustly, ensuring that the monitoring container doesn't encounter issues or crashes when the file becomes available. A more elegant and automated solution is sought to replace the current workaround.
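One possible direction on the submission-container side, sketched here only as an illustration: write the results to a temporary file in the same directory and rename it to submission.csv once it is complete, so the monitoring container never sees a half-written file. The function name `write_submission_atomically` and the example rows are hypothetical, and this assumes both containers share the directory through a mounted volume. It covers only the partially-written-file race, not the pod exiting when the submission container finishes.

```python
import csv
import os
import tempfile


def write_submission_atomically(rows, final_path):
    """Write CSV rows to a temp file, then rename it into place.

    Hypothetical helper: a reader watching `final_path` sees either no
    file or the complete file, never a partial one.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    # The temp file must live in the same directory (same filesystem)
    # so that os.replace is an atomic rename on POSIX systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A submission script would then call `write_submission_atomically(rows, "submission.csv")` instead of opening the final path directly.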

Current Behavior: Abrupt crashes in the pod occur when the submission.csv file is being created. The monitoring container is killed, likely due to the file not being immediately available for processing.

We need to investigate and implement a cleaner solution to handle the timing discrepancies between the submission file creation and the monitoring container's activity. This may involve enhancing the synchronization mechanism or introducing a more robust way for the monitoring container to detect the availability of submission.csv.
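As a rough sketch of what "a more robust way to detect availability" could look like on the monitoring side: poll until the file exists and its size has stopped changing for a few consecutive checks, with a timeout instead of a fixed sleep. The function name `wait_for_stable_file` and its default values are assumptions for illustration, not part of the EvalAI code base, and this does not by itself keep the pod alive after the submission container exits.

```python
import os
import time


def wait_for_stable_file(path, timeout=60.0, poll_interval=0.5, stable_checks=3):
    """Return True once `path` exists and its size is unchanged for
    `stable_checks` consecutive polls; return False on timeout.

    Hypothetical helper; the defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    unchanged = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # File not there yet (or briefly unavailable): reset and retry.
            last_size, unchanged = -1, 0
        else:
            if size == last_size:
                unchanged += 1
                if unchanged >= stable_checks:
                    return True
            else:
                # Size changed, so the writer is probably still working.
                last_size, unchanged = size, 0
        time.sleep(poll_interval)
    return False
```

The monitoring loop could call `wait_for_stable_file("submission.csv")` before reading the file, and treat a `False` result as a failed submission rather than crashing.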

poju3185 commented 8 months ago

Hi @shivas1516, are you still working on this issue?

shivas1516 commented 8 months ago

No, feel free to assign this issue to yourself.

ranjanmangla1 commented 7 months ago

Hi @poju3185, if you are not working on this issue, I would love to take it.