Open coleshaw opened 1 year ago
Examining the service in Portainer shows CPU pegged at 100%, though it's unclear if that's 100% of 1 CPU or 100% of the maximum container resource specification...
Note that running the task manually from a Portainer console on Metis at least produces log messages that do not appear in Airflow (so the process seems to get further):
Found 11 data blocks.
11 images require thumbnails to be generated.
Though it also seems to be stuck... I wonder if there are large files that VIPS isn't able to handle?
Doing some investigation, it seems stuck on a fastq.gz file that is only `205594` bytes in size ... Creating the thumbnail manually in IRB with the same Ruby code and using the command-line tool `vipsthumbnail` both hang as well, so there seems to be some interaction between this file and VIPS? But other fastqs should have worked, too; this file was uploaded back in 2022-03, so it's not like it's the first one of its type...
The next fastq in the queue works fine using `vipsthumbnail` on the command line, so I'm going to manually flag this one as `has_thumbnail: false` to unstick the ETL ... but definitely something to be aware of.
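One possible guard against this kind of hang, as a rough sketch only: run the thumbnail command in a child process and kill it if it exceeds a time limit, so a single bad file cannot stall the whole task. The helper names `run_with_timeout` and `thumbnail_or_skip` and the 30-second limit are hypothetical and not part of the existing ETL code. It also assumes shelling out to `vipsthumbnail` is acceptable in place of the in-process Ruby call.

```ruby
require "timeout"

# Hypothetical helper: run an external command and kill it if it takes
# longer than `seconds`. Returns true only if the command exits with
# status 0; returns false on a non-zero exit or on timeout.
def run_with_timeout(cmd, seconds)
  # Discard the child's output; the caller only needs success/failure.
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  begin
    Timeout.timeout(seconds) do
      _, status = Process.wait2(pid)
      status.success?
    end
  rescue Timeout::Error
    # The child is still running: kill it and reap it to avoid a zombie.
    Process.kill("KILL", pid)
    Process.wait(pid)
    false
  end
end

# Hypothetical wrapper: try to thumbnail one file, giving up after
# `seconds` (an assumed limit) so the caller can flag the file and move on.
def thumbnail_or_skip(path, seconds: 30)
  run_with_timeout(["vipsthumbnail", path], seconds)
end
```

If `thumbnail_or_skip` returns false, the caller could set `has_thumbnail: false` on that block automatically instead of needing a manual fix.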
For future reference, the stuck block was `123966a0397c2d27256e88bac363ec1d`, so perhaps someone will be able to probe deeper into why this specific file caused issues with VIPS...
This task has been stuck for a couple of weeks in Airflow (since 1/31), and it's not clear why ... need to do further investigation. There are no error messages, and it seems to start up okay. I have increased memory and CPU for the DAG, but neither of those helped.