mountetna / monoetna

mono-repository version of etna projects
GNU General Public License v2.0

Metis: generate_thumbnails airflow task is stuck #1210

Open coleshaw opened 1 year ago

coleshaw commented 1 year ago

This task has been stuck in Airflow for a couple of weeks (since 1/31), and it's not clear why; further investigation is needed. There are no error messages, and the task seems to start up okay. I have increased memory and CPU for the DAG, but neither helped.

coleshaw commented 1 year ago

Examining the service in Portainer shows CPU pegged at 100%, though it's unclear if that's 100% of 1 CPU or 100% of the maximum container resource specification...

coleshaw commented 1 year ago

Note that running the task manually in a Portainer console attached to Metis at least produces a log message that does not appear in Airflow (so the process seems to get further):

Found 11 data blocks.
11 images require thumbnails to be generated.

Though it also seems to be stuck... I wonder if there are large files that VIPS isn't able to handle?

coleshaw commented 1 year ago

Doing some investigation, it seems to be stuck on a fastq.gz file that is only 205594 bytes in size. Creating the thumbnail manually in IRB with the same Ruby code hangs, and so does the command-line tool vipsthumbnail, so there seems to be some interaction between this file and VIPS. But other fastqs should have worked too; this file was uploaded back in 2022-03, so it's not like it's the first one of its type...

coleshaw commented 1 year ago

The next fastq in the queue works fine using vipsthumbnail on the command line, so I'm going to manually flag this one as has_thumbnail: false to unstick the ETL, but this is definitely something to be aware of.
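
One way to keep a single bad file from stalling the whole task would be to run the thumbnail command in a child process with a time limit. This is only a rough sketch, not what Metis does today: `run_with_timeout` is a hypothetical helper, the 60-second limit is an arbitrary guess, and it assumes thumbnails could be produced by shelling out to vipsthumbnail rather than calling VIPS in-process.

```ruby
require "timeout"

# Hypothetical helper (not part of Metis): run an external command in a
# child process and kill it if it does not finish within `seconds`.
# Returns :ok, :failed, or :timed_out.
def run_with_timeout(cmd, seconds)
  # Spawn without a shell; discard the child's stdout and stderr.
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  begin
    status = Timeout.timeout(seconds) { Process.wait2(pid).last }
    status.success? ? :ok : :failed
  rescue Timeout::Error
    # The child is still running (e.g. spinning at 100% CPU); kill and reap it.
    Process.kill("KILL", pid)
    Process.wait(pid)
    :timed_out
  end
end

# Example use (the path is a placeholder):
#   result = run_with_timeout(["vipsthumbnail", "/path/to/data_block"], 60)
#   # On :timed_out or :failed, the block could be flagged
#   # has_thumbnail: false so the rest of the queue keeps moving.
```

With something like this, a block that hangs would be skipped automatically instead of needing a manual flag, though it would not explain why VIPS hangs on the file in the first place.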

coleshaw commented 1 year ago

For future reference, the stuck block was 123966a0397c2d27256e88bac363ec1d, so perhaps someone will be able to probe deeper into why this specific file caused issues with VIPS...