fabric-testbed / fabfed

FABRIC Tool-based Federation Kit for a Testbed of Testbeds

DX connection name not found by SENSE-AWS after successful FABRIC slice creation #147

Open xi-yang opened 3 months ago

xi-yang commented 3 months ago

In a FABRIC+SENSE-AWS workflow session, after the FABRIC slice was successfully created and the DX connection ordered, SENSE-AWS threw an error:

2024-08-22 11:16:57,968 [controller.py:303] [ERROR] Returned code 500 with error '{'errorCodeName': 'GENERIC', 'message': "MCE_AwsDxStitching-doStitching-5a904b6f-06ec-41a0-979d-95fd2a5d7cbd - cannot find DX resource for 'ppg-show-demo3-296256999979': must be uri or name of a Hosted Connection"}'

However, the connection named 'ppg-show-demo3-296256999979' did show up on the AWS console. Deleting the SENSE instance and re-applying worked, so it is likely a timing issue.

abessiari commented 3 months ago

Strange. Did you have to delete the SENSE instance? I mean, was it necessary? Or would a SENSE retry have worked?

xi-yang commented 3 months ago

> Strange. Did you have to delete the SENSE instance? I mean, was it necessary? Or would a SENSE retry have worked?

Not sure if that was necessary. I just did it. The SENSE instance failed to compile, so no retry was allowed at that stage. The timing is the strange part.

abessiari commented 3 months ago

Yes, the timing is strange, as we do wait for the FABRIC slice to reach StableOk. You know, on the FABRIC side ordering a connection takes about a minute or so. Creating the node takes much longer. Is there an exception on the SENSE side of things? Anyway, I will keep an eye on it.

xi-yang commented 3 months ago

> Yes, the timing is strange, as we do wait for the FABRIC slice to reach StableOk. You know, on the FABRIC side ordering a connection takes about a minute or so. Creating the node takes much longer. Is there an exception on the SENSE side of things? Anyway, I will keep an eye on it.

SENSE-O raised an exception because the computation failed to find the named DX connection given in the service profile. So it ended up in INIT-FAILED, and no delta was generated.

abessiari commented 3 months ago

OK, I will give it a try when I come back from my son's doctor appointment and update you.

BTW, the fabfed logs show the intent that is sent to SENSE, so the name of the DX connection should be in there. If you could attach the logs, I can dig further.

xi-yang commented 3 months ago

I only saw that problem once, so it is likely a rare case and hard to reproduce. We may just stay put and keep an eye on this.

We could add some delay (like 30 seconds) if we want to play it safe. Otherwise, this is more of an issue for AL2S or AL2S-AM.
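As a rough illustration of the "add some delay" idea, a bounded poll may be gentler than a fixed 30-second sleep: it returns as soon as the DX connection becomes visible and gives up after the timeout. This is only a sketch, not fabfed or SENSE code. The names `wait_for_connection` and `lookup` are hypothetical, and `lookup` stands in for whatever check is available (for instance, a wrapper that lists Direct Connect connections through boto3's `describe_connections` and matches on the connection name).

```python
import time
from typing import Callable


def wait_for_connection(
    name: str,
    lookup: Callable[[str], bool],
    timeout: float = 30.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll lookup(name) until it returns True or `timeout` seconds elapse.

    `lookup` is a hypothetical callable supplied by the caller; it should
    return True once the named DX connection is visible to the consumer.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if lookup(name):
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


if __name__ == "__main__":
    # Toy lookup that "finds" the connection on the third attempt.
    attempts = []

    def toy_lookup(conn_name: str) -> bool:
        attempts.append(conn_name)
        return len(attempts) >= 3

    found = wait_for_connection(
        "ppg-show-demo3-296256999979", toy_lookup, timeout=30, interval=0.01
    )
    print("found:", found, "after", len(attempts), "attempts")
```

Calling something like this right before the intent is sent to SENSE would keep the common case fast while still covering a window of roughly 30 seconds. Whether that window is long enough is an open question, since the problem was only seen once.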