radical-collaboration / CyberManufacturing

CDSE Multi-scale CI Project

RP crashes on the client side but the execution keeps running on the resource #26

Closed csampat closed 6 years ago

csampat commented 6 years ago

I am running the new RP experiments. They get submitted and start to execute on the resource. The client then shows that it has crashed and prints the session lifetime, whereas the experiments are still executing on the resource.

E.g.:

--------------------------------------------------------------------------------
gather results                                                                  

wait for 1 unit(s)
        -                                                                     ok
.                                                                             ok
submit 1 unit(s)
        .                                                                     ok

--------------------------------------------------------------------------------
gather results                                                                  

wait for 1 unit(s)
        -                                                                     ok
.                                                                             ok
submit 1 unit(s)
        .                                                                     ok

--------------------------------------------------------------------------------
gather results                                                                  

wait for 1 unit(s)
        -                                                                     ok

.                                                                             ok
submit 1 unit(s)
        .                                                                     ok

--------------------------------------------------------------------------------
gather results                                                                  

wait for 1 unit(s)
        +                                                                     ok

--------------------------------------------------------------------------------
finalize                                                                        

closing session diam200DEM64PBM16_1                                            \
close unit manager                                                            ok
close pilot manager                                                            \
wait for 1 pilot(s)
                                                                         timeout
                                                                              ok
+ diam200DEM64PBM16_1 (json)
+ pilot.0000 (profiles)
+ pilot.0000 (logfiles)
session lifetime: 279.1s                                                      ok

iparask commented 6 years ago

Are you running with --verbose DEBUG? If so, there should be a folder named after the session. Go into it and run grep ERROR *.log
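
If grep is not handy, the same check can be sketched in Python. This is only a minimal illustration, assuming the session folder holds plain-text *.log files directly inside it; the function name find_error_lines and the script name used below are hypothetical and not part of RP.

```python
from pathlib import Path


def find_error_lines(session_dir, pattern="ERROR"):
    """Return (file name, line number, text) for each *.log line containing pattern.

    Hypothetical helper: it mirrors `grep ERROR *.log` run inside the
    session folder and does not use any RP API.
    """
    hits = []
    # Only top-level *.log files are scanned, like the shell glob would do.
    for log_file in sorted(Path(session_dir).glob("*.log")):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with log_file.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern in line:
                    hits.append((log_file.name, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # The session folder is passed as the first argument; default to the cwd.
    session = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, number, text in find_error_lines(session):
        print(f"{name}:{number}: {text}")
```

Saved under a name of your choice (say find_errors.py), it would be run from the directory that contains the session folder, e.g. python find_errors.py diam200DEM64PBM16_1, and prints one line per match with the file name and line number.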

iparask commented 6 years ago

Duplicate of #28. Closing.