The data extract for the full Coronavirus dataset appears to have stalled sometime after March 25, probably either when the shared /storage drive ran out of space, or when the server had to be restarted after a network outage. TweetSets reported the task as still processing, even though no files were being produced. To restart the task, it's necessary to delete the pertinent folder in /storage/full_datasets.
We need a way to recover gracefully from such errors.
If we continue using Celery, look at the call to _generate_tasks.AsyncResult(task_id), which was returning a "PENDING" status even when no viable task existed.
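For context, Celery reports PENDING both for tasks that are genuinely queued and for task ids its result backend has never seen (or has lost), so the state alone can't prove a task is alive. One possible mitigation, sketched below, is to cross-check the reported state against recent activity in the extract's output folder; the helper name and threshold are hypothetical, not existing TweetSets code:

```python
import time

# Terminal states from the result backend are trustworthy; anything else
# (PENDING, STARTED, RETRY) could be a live task or a lost one.
TERMINAL_STATES = {'SUCCESS', 'FAILURE', 'REVOKED'}

def task_looks_dead(state, last_output_mtime, now=None, stale_after=3600):
    """Return True if a task in a non-terminal state has written no
    output for longer than stale_after seconds."""
    if state in TERMINAL_STATES:
        return False  # the backend actually recorded an outcome
    if now is None:
        now = time.time()
    return (now - last_output_mtime) > stale_after
```

A periodic job could call this with the mtime of the newest file under the dataset's folder in /storage/full_datasets and, when it returns True, revoke the task and clean up the folder automatically instead of requiring manual deletion.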
If we are able to use Spark for extracts, consider exposing the Spark jobs UI from the container (for monitoring and disabling of jobs).
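If Spark is adopted, exposing the jobs UI could be as simple as a port mapping in the compose file. A sketch, assuming a docker-compose setup (the service name here is hypothetical; 4040 is Spark's default per-application UI port and 8080 the standalone master's web UI port):

```yaml
services:
  spark:                 # hypothetical service name
    ports:
      - "4040:4040"      # per-application jobs UI (Spark's default port)
      - "8080:8080"      # standalone master web UI
```

Note that the jobs UI is only served while an application is running; killing a job from the UI requires spark.ui.killEnabled to be set to true.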