canonical / testflinger

https://testflinger.readthedocs.io/en/latest/
GNU General Public License v3.0

Some TF jobs with a large image as the provision_data URL are failing with a timeout error #297

Closed rmartin013 closed 4 days ago

rmartin013 commented 4 days ago

Some jobs are failing when a 5GB image is provided in the provision_data section; this error is quite new. There used to be a 30 min timeout in the past, and it seems that reducing this timeout to 20 min can trigger the issue with some Riverside devices. See http://10.102.156.15:8080/job/partner-engineering/job/riverside/job/riverside-jetson-agx-orin-cdimage-daily/273/console
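
For a rough sense of scale, here is a minimal back-of-the-envelope sketch of the average throughput a device would need to sustain to get a 5GB image through within each timeout. It assumes "5GB" means 5 * 10^9 bytes and that the timeout bounds the whole transfer; the function name is hypothetical and nothing here is taken from the Testflinger code or the job logs.

```python
def min_throughput_mb_s(image_bytes: int, timeout_min: float) -> float:
    """Minimum average throughput in MB/s (1 MB = 10**6 bytes)
    needed to move image_bytes within timeout_min minutes."""
    return image_bytes / 1e6 / (timeout_min * 60)


# Assumption: "5GB" is read as 5 * 10^9 bytes.
IMAGE_BYTES = 5 * 10**9

for timeout_min in (20, 30):
    rate = min_throughput_mb_s(IMAGE_BYTES, timeout_min)
    print(f"{timeout_min} min timeout: at least {rate:.2f} MB/s sustained")
```

Under these assumptions the 20 min limit needs about 4.17 MB/s sustained, against about 2.78 MB/s for 30 min, so a device that only just fit the old limit would now time out.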

syncronize-issues-to-jira[bot] commented 4 days ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/CERTTF-347.

This message was autogenerated

plars commented 4 days ago

Based on my understanding, https://github.com/canonical/testflinger/pull/298 should fix this, but @rmartin013 can you confirm that 30 min. is sufficient for these devices to write to their installation media?

rmartin013 commented 4 days ago

> Based on my understanding, #298 should fix this, but @rmartin013 can you confirm that 30 min. is sufficient for these devices to write to their installation media?

Yes, I think it should. Funny thing, I created the exact same commit, probably around the same time ^^, but I am fine with yours.

plars commented 4 days ago

> > Based on my understanding, #298 should fix this, but @rmartin013 can you confirm that 30 min. is sufficient for these devices to write to their installation media?
>
> Yes I think it should. Funny thing, I created the exact same commit, probably around the same time ^^, but I am fine with yours

I looked and didn't see yours, so I pushed a quick fix. But it actually works out a lot better if we use yours, since I can approve yours :) I'll remove this one, and thanks for the fixes!