Closed ajcasagrande closed 1 year ago
I cannot recreate the error by using device-virtual or device-modbus with 2.3.0 or main branch. Could you provide us with the onvif camera setup steps, the profile and the device information?
@ajcasagrande, is this with 2.3.0, or the latest WIP 3.0.0 code?
@lenny-intel facing this issue with Levski
.
@ajcasagrande, is this with 2.3.0, or the latest WIP 3.0.0 code?
Yes this is using levski branch of edgex compose, stock 2.3.0 images (ignore the first usb one):
Full docker compose: docker-compose.yml.txt
Ok, so I have looked into it further, and the issue is being caused due to a timing issue in regards to the actual architecture of EdgeX (device-sdk-go) itself. Since EdgeX only provides an Initialize function and not a Start function, we (device-onvif-camera devs) have placed logic in the Initialize block which will ultimately cause the UpdateDevice to be called.
However, looking at the code: https://github.com/edgexfoundry/device-sdk-go/blob/v2.3.0/pkg/service/init.go#L56-L67
The Initialize function is called before the selfRegister function. Because of this, the dic has not been updated to reflect the current running service, thus resulting in the empty value being returned.
The problem is that I assume this order was chosen to prevent a device service that failed initialization from registering. This is where the connundrum happens. So my suggestions forward is that I will create a feature request to implement a Start callback for the service which is called after the registration and bootstrapping. And then for the current codebase, implement a hacky async sleep to allow the registration to occur before the devices are communicated with.
Edit: Alternatively, a potential fix for the current codebase would be to move these line of code https://github.com/edgexfoundry/device-sdk-go/blob/v2.3.0/pkg/service/service.go#L221-L227 to a new function which is called earlier in the bootstrap process (before driver.Initialize at least).
@ajcasagrande, we will need to back port you hack to the levski code and release a path since this is a blocking bug, correct?
@ajcasagrande, we will need to back port you hack to the levski code and release a path since this is a blocking bug, correct?
Yes, either the hack will need to be applied to the levski release of device-onvif-camera (and possibly device-usb-camera), or the alternative method would need to be patched in levski device-sdk-go.
Edit: Alternatively, a potential fix for the current codebase would be to move these line of code https://github.com/edgexfoundry/device-sdk-go/blob/v2.3.0/pkg/service/service.go#L221-L227 to a new function which is called earlier in the bootstrap process (before driver.Initialize at least).
@ajcasagrande , this seems like a good fix for Levski and then the Start API for Minnesota for better control. Thoughts?
@lenny-intel Yes, I like that fix for Levski as opposed to doing the hack-y sleep method, assuming there are no other issues that arise after patching it.
Moved this to the ONVIF repo for implementing the Start() API for this service
Latest SDK requires the Start API() to be implemented.
@ajcasagrande do you need my team to help fix this issue in 3.0?
@ajcasagrande @presatish are you able to resolve this by the code freeze date (May 10th)?
@cloudxxx8 I think we should be able to. We have prioritized this issue along with others to complete by May 10th.
@cloudxxx8 Sorry, my notifications seem to not be working properly, or I set my outlook rules incorrectly. Yes, we will address this. Bruce has already added the stub for it, and I can implement it after my current refactor PR, since it may cause conflicts if done separately.
š Bug Report
Affected Services [REQUIRED]
device-sdk-go
Is this a regression?
Yes
Description and Minimal Reproduction [REQUIRED]
All edgex services are running normally. Called docker stop. Wait a while. Call docker start . Observe issue where no device commands work due to 404 device not found error, even though the devices clearly exist in core-metadata.
The first error message code is located in
device.Initialize
and it callsfor _, device := range d.sdkService.Devices() {
, which in turn callsreturn cache.Devices().All()
, which leads me to believe the devices do in fact exist in the cache.š„ Exception or Error
š Your Environment
Deployment Environment: docker compose 2.x edgex compose levski branch v2.3.0
EdgeX Version [REQUIRED]: 2.3.0
Anything else relevant?