ratt-ru / QuartiCal

CubiCal, but with greater power.
MIT License

Possible distributed graph serialisation issue #44

Closed sjperkins closed 3 years ago

sjperkins commented 3 years ago

I'm seeing lots of these sorts of errors at the CHPC when the graph is sent to the scheduler, but I'm not sure if it's because I've done something incorrectly:

distributed.comm.utils - ERROR - ('Could not serialize object of type tuple.', '(<built-in function getitem>, (<function apply at 0x2aaab2ff6f80>, <function estimate_noise_kernel at 0x2aaad2fc35f0>, [], (<class \'dict\'>, [[\'data\', "(\'sub-07574af27f3f00ce36d4f38e56f7401d\', 0, 0, 0)"], [\'flags\', "(\'bflags-bb1b501df89643688a177b2aad22984d\', 0, 0, 0)"], [\'a1\', "(\'read~ANTENNA1~[28,0,0]-1557766852_sdp_l0-CGCG044_046-corr-4k-nopolcal.ms-fe8051a371a1b0644afa9749bd6b8af2\', 0)"], [\'a2\', "(\'read~ANTENNA2~[28,0,0]-1557766852_sdp_l0-CGCG044_046-corr-4k-nopolcal.ms-d24a5763ccc8697f8667516d958d54b7\', 0)"], [\'n_ant\', 60]])), \'ivpc\')')

This may have only been exposed because this is a proper cluster (as opposed to a LocalCluster). Does it look familiar?

JSKenyon commented 3 years ago

This could be something to do with custom graph construction. No idea when I will have a chance to dig into it though.
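One way to narrow this down might be to pickle each task in the graph locally before it goes to a scheduler. The sketch below is only an illustration and is not QuartiCal code: `find_unpicklable_tasks` and the toy `graph` are hypothetical, and it uses the standard-library `pickle` only. As far as I know, distributed falls back to cloudpickle when plain pickle fails, so this check is stricter than what the scheduler actually requires.

```python
import pickle
import threading
from operator import getitem


def find_unpicklable_tasks(graph):
    """Return {key: error message} for each task that pickle cannot serialise."""
    failures = {}
    for key, task in graph.items():
        try:
            pickle.dumps(task)
        except Exception as exc:  # pickle raises several different exception types
            failures[key] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical hand-built graph in the usual dict-of-tuples form.
# The tasks are never executed here, only serialised.
graph = {
    ("data", 0): (sum, [1, 2, 3]),
    ("item", 0): (getitem, [10, 20, 30], 0),
    # A lock cannot be pickled, so this task stands in for a problematic one.
    ("bad", 0): (len, threading.Lock()),
}

failures = find_unpicklable_tasks(graph)
for key, message in failures.items():
    print(key, "->", message)
```

For a real collection, the dict passed in could be built from its graph (for example `dict(collection.__dask_graph__())`). Any key reported here points at the task worth inspecting first.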


sjperkins commented 3 years ago

I've attached the full output log:

output.txt