Geontech / meta-redhawk-sdr

REDHAWK SDR Layer for Yocto/OpenEmbedded -based deployments
http://geontech.com/getting-started-with-meta-redhawk-sdr/
GNU Lesser General Public License v3.0

Failed to satisfy device dependencies for component - Waveform loading on Target board #63

Closed NayanaAnand closed 3 years ago

NayanaAnand commented 3 years ago

Hi,

I have built meta-redhawk-sdr (branch: pyro) with the Yocto BSP for am57xx and loaded the image onto the target board.

The loaded image has all the REDHAWK components and the basic waveforms. I am now trying to launch one of the basic waveforms from the image loaded on the target board (Phytec).

Procedure followed:

  1. Connected the target board to the desktop server via serial cable; a VLAN is also available.
  2. Started the Domain Manager and Device Manager with the naming service on the target board.
  3. Attached to the domain from a Python session as shown below:

    [redhawk@2a23bdd8bb78 ~]$ python
    Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from ossie.utils import redhawk,sb
    >>> import time
    >>> dom = redhawk.attach("REDHAWK_DEV")
    >>> app = createApplication("/waveforms/rh/basic_components_demo/basic_components_demo.sad.xml")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'createApplication' is not defined
    >>> app = dom.createApplication("/waveforms/rh/basic_components_demo/basic_components_demo.sad.xml")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/redhawk/core/lib/python/ossie/utils/redhawk/core.py", line 1794, in createApplication
        app = app_factory.create(name, initConfiguration, [])
      File "/usr/local/redhawk/core/lib/python/ossie/cf/cf_idl.py", line 2026, in create
        return _omnipy.invoke(self, "create", _0_CF.ApplicationFactory._d_create, args)
    ossie.cf.CF.CreateApplicationError: CF.ApplicationFactory.CreateApplicationError(errorNumber=CF_ENOSPC, msg="Failed to satisfy device dependencies for component: 'rh.SigGen' with component id: 'SigGen_sine:rh.basic_components_demo_339_120418450_1'")

Understanding from source:

From the source and the *.spd.xml files, I observed that the processor name plays a major role for target boards (cross-compiled OS).

My requirement for the target board is MACHINE_NAME=am57xx-phycore-rdk and PROCESSOR=armv7l

I have set both variables in conf/local.conf.

But only a few of the spd.xml files show an armv7l implementation: dsp, rbdsdecoder, and the Device Manager and Domain Manager.

All other components have an armv7ahf-neon processor implementation.

Can anyone suggest how to change the processor name from armv7ahf-neon to armv7l while building the image?

If my observations about the above error are wrong, can you suggest a solution for "Failed to satisfy device dependencies for component"?

Regards, Nayana

btgoodwin commented 3 years ago

You can try setting REDHAWK_PROCESSOR in local.conf (or your machine.conf), for example, to override the default guess at the architecture. There are a variety of places where this value is patched into place so that responses from things like processor.machine will be guaranteed to match.

Please keep in mind though that pyro is a very old release and only lightly tested. If you would like some visibility in what I've been trying to do, please see dunfell-next. I'm slowly rolling derived patches from my attempts at the start of the year with zeus-next and warrior-next, which were squashes unfortunately, making them a bit harder to trace through the various related changes.

On dunfell-next, I have some basic OEQA testing enabled, and I've had success with the components only on aarch64 in QEMU. I've tried ARM 32 and x86 64 -- both of these have their own strange issues, with the former's being that 5 of the components crash out of their service functions with a positive overflow exception likely coming from boost. If you run into this and find a patch, please feel free to submit pull requests.

NayanaAnand commented 3 years ago

Hi Thomas,

I have set REDHAWK_PROCESSOR="armv7l" in local.conf, but the REDHAWK components are not picking it up. Is there any other way to update the processor name in the REDHAWK components while building with Yocto?

I have also tried all 3 of the branches mentioned above and am getting some strange issues.

Thanks, Nayana

btgoodwin commented 3 years ago

It sounds like bitbake isn't picking up on the sensitivity of dynamic_arch_patch (redhawk-entity.bbclass) to REDHAWK_PROCESSOR and thus isn't invalidating the shared state cache for each of those projects when you re-ran the build. You'll probably have to re-run the cleansstate task for each of the components, softpkgs, devices, and redhawk itself to pull in changes to that variable. I think making use of vardeps on the task was one way to make this explicit, if you're interested in exploring a fix that can be PR'd against the pyro branch.
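A minimal sketch of the clean-up step described above, assuming a list of recipe names. The names in `RECIPES` are hypothetical placeholders, not the layer's actual recipe names; only the `bitbake -c cleansstate <recipe>` invocation itself is standard. The script prints the commands by default and only runs them when `dry_run` is turned off.

```python
import subprocess

# Hypothetical recipe names; replace with the recipes in your own image.
RECIPES = ["redhawk", "gpp", "siggen"]


def cleansstate_commands(recipes):
    """Build one `bitbake -c cleansstate <recipe>` argv list per recipe."""
    return [["bitbake", "-c", "cleansstate", recipe] for recipe in recipes]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.check_call(argv)


# Dry run: shows what would be executed inside the bitbake environment.
run_all(cleansstate_commands(RECIPES))
```

After the clean, rebuilding the image should re-run the patching tasks with the current REDHAWK_PROCESSOR value.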

As for issues on those 3 branches -- just to be clear I was referring to strange run-time issues, not build-time. The build for each of those has been solid for me on CentOS 7 in that it completes successfully and runs through the OEQA tests (with runtime errors for certain components).

NayanaAnand commented 3 years ago

Can you please let me know where the REDHAWK components take the processor name "armv7ahf-neon" in their spd.xml files from, even though the REDHAWK default processor is "armv7l" and I am also setting REDHAWK_PROCESSOR to "armv7l" in conf/local.conf?

Note: REDHAWK devices/nodes like GPP and dsp, as well as DomainManager.spd.xml and DeviceManager.spd.xml, have the processor name "armv7l".

Thanks, Nayana

btgoodwin commented 3 years ago

It comes from a number of places, unfortunately. Much of this is driven by the SCA 2.2.x spec.

The core framework uses a combination of C++ libraries and python ones to read from the environment what the processor architecture is, and then tries to match that against the SPD file. Depending on the library and system call, and what hardware, the results vary. And the core framework's autotools configuration makes no use of target architecture in configuring the resulting packages -- it's "on rails" to do x86_64, period. This led to updating so many patches on every release that eventually I switched to dynamic patching (regular expressions in repeatable bitbake tasks).

The big one of course is this REDHAWK_PROCESSOR one, where we're directly patching the source code for the Domain Manager, Device Manager, and sandbox to always return some expected result by clobbering it with a default value. For device managers generated from devices (like gpp_setup), I began migrating us to running the related scripts as post-install functions on the target because those scripts also must read from the target to properly configure limits, etc. -- things we could never know for all possible build configurations.

I eventually chose _PACKAGEARCH as the default because in testing, it was most consistently "close" to what was returned from the kernel's un.machine, for example, but again there was no guarantee even that result would match the other ways this value was checked in the source code (hence the patch-n-clobber approach above). We then also have a utility script that dynamically patches the installed SPDs to match. This is why, on the newer releases, one generally shouldn't have to mess with the REDHAWK_PROCESSOR value at all. It should all "just work."
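To illustrate the idea of patching an SPD so its processor name matches the target, here is a rough sketch. It is not the layer's actual utility script; the SPD fragment is an illustrative sample and `set_processor` is a hypothetical helper. It assumes the usual SPD layout of `<implementation>` elements containing `<processor name="..."/>`.

```python
import xml.etree.ElementTree as ET

# Illustrative SPD fragment, not copied from a real component.
SAMPLE_SPD = """<softpkg id="DCE:example" name="rh.Example">
  <implementation id="cpp">
    <code type="Executable"><localfile name="cpp/Example"/></code>
    <os name="Linux"/>
    <processor name="armv7ahf-neon"/>
  </implementation>
</softpkg>"""


def set_processor(spd_text, impl_id, processor):
    """Return SPD XML with every <processor> of one implementation renamed."""
    root = ET.fromstring(spd_text)
    for impl in root.findall("implementation"):
        if impl.get("id") == impl_id:
            for proc in impl.findall("processor"):
                proc.set("name", processor)
    return ET.tostring(root, encoding="unicode")


patched = set_processor(SAMPLE_SPD, "cpp", "armv7l")
print(patched)
```

Note that ElementTree drops any DOCTYPE line when it serializes, so a real tool would need to preserve it separately.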

This group of changes came after creating that pyro integration branch, and in fact many of these changes have not been migrated downwards to all the older releases (we've had no programmatic need to do so).

At this point I'm the only maintainer of the project as well, so when I do have spare time, I tend to focus on sniff-testing leading-edge compatibility with back-porting being a secondary concern.

This being a FOSS, community-supported project, we have accepted and do accept pull request contributions. If you do make a back-port to pyro (like pull dunfell-next and attempt back-porting it to pyro compatibility), please run the OEQA tests for a few qemu targets (arm, aarch64 to name two).

btgoodwin commented 3 years ago

I should also point out that we've found these patches, though successful at build and run-time, are not great in the poky SDK environment. The only real "fix" to making this all "better" is to spend considerable time upstreaming related patches into the core framework and its assets to better support cross-compiling, and then over the course of a few releases, let all the assets be regenerated to support those features and remove the related patching in this layer. However, upstreaming those changes would imply that organization would be responsible for maintaining that code by proving through tests that multiple architectures are supported, and presently, that's not likely to go well.

NayanaAnand commented 3 years ago

Hi Thomas,

I cleaned each of the components, re-ran the build as per your suggestion, and loaded the waveform on the Phytec target board. The suggestion I followed was:

"re-run the cleansstate task for each of the components, softpkgs, devices, and redhawk itself to pull in changes to that variable. I think making use of vardeps on the task was one way to make this explicit, if you're interested in exploring a fix that can be PR'd against the pyro branch."

The build was successful, with the processor name "armv7l" as expected. But I am still facing the same issue while launching the waveform on the Phytec board.

For the procedure and the Phytec terminal output, please refer to the attached screenshot: Waveform Error.odt.zip

Can you guide me on where I am going wrong, or whether something is missing?

Regards, Nayana

btgoodwin commented 3 years ago

Hmm, this is sounding a lot like an error I found when I began using the spd_utility to "patch in" additional implementations. It impacted those two components, specifically, because they are some of the few that have alternate implementations (Java and Python). The issue in that case was we would install an SPD listing, say, 4 implementations (cpp, python, java, cpp_armv7l) but only load the binary for the last one. The application factory in that case would fail to match cpp's processor_name needs and move on to python. It would fail there because the entrypoint wasn't installed on the target, at which point the application factory would abort launch rather than proceed to the next implementation. I opened an issue for this recently; my ~fix~ work-around in the meantime is related to our spd_utility, which now deletes all implementations from the SPD XML that will not be installed.
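The work-around described above can be sketched roughly as follows. This is not the real spd_utility; `prune_missing` is a hypothetical helper, the SPD fragment is a made-up sample, and the sketch assumes each implementation names its entry point in a `<code><entrypoint>` element relative to the install directory.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Illustrative SPD listing three implementations; only one will be installed.
SAMPLE_SPD = """<softpkg id="DCE:example" name="rh.Example">
  <implementation id="cpp">
    <code type="Executable"><entrypoint>cpp/Example</entrypoint></code>
  </implementation>
  <implementation id="python">
    <code type="Executable"><entrypoint>python/Example.py</entrypoint></code>
  </implementation>
  <implementation id="cpp_armv7l">
    <code type="Executable"><entrypoint>cpp_armv7l/Example</entrypoint></code>
  </implementation>
</softpkg>"""


def prune_missing(spd_text, install_dir):
    """Drop implementations whose entry point file is absent from install_dir."""
    root = ET.fromstring(spd_text)
    for impl in list(root.findall("implementation")):
        entry = impl.findtext("code/entrypoint")
        if entry is not None and not os.path.exists(os.path.join(install_dir, entry)):
            root.remove(impl)
    return root


# Simulate a target where only the cpp_armv7l binary was installed.
install_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(install_dir, "cpp_armv7l"))
open(os.path.join(install_dir, "cpp_armv7l", "Example"), "w").close()

kept = [impl.get("id") for impl in prune_missing(SAMPLE_SPD, install_dir).findall("implementation")]
print(kept)  # ['cpp_armv7l']
```

With only the installed implementation left in the SPD, the application factory has nothing to fall through to that would abort the launch.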

Have you tried running your Domain with the debug level turned up to debug or trace when launching the app? It would be nice to see what decisions are driving the application factory to fail in this case. (The log is going to be very long; it's worth redirecting to a file when running nodeBooter like this.)

Something else to check -- the GPP property listing in its device manager should have the processor_name property (its ID is a UUID) set to armv7l or whatever matches what is in your component SPDs (the value may change at run-time according to the properties API, but because of the SCA interpretation, the run-time value "doesn't matter" -- only what's in the XML does).

btgoodwin commented 3 years ago

I just want to further point out - the SPD utility patch I made to remove the other implementation references wouldn't matter in the case of Pyro. On this branch, we replace processor_name of the cpp implementation, so if that value doesn't match the GPP PRF.xml value (or DCD.xml override for processor_name on the GPP device), then you'll see a failure about not finding executable devices.
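A quick offline check of that match might look like the sketch below. It only compares the processor_name value in a PRF against the processor names listed per implementation in an SPD; it ignores DCD overrides and every other allocation property the application factory considers. Both XML samples and all function names are illustrative assumptions, and the property is located by its `name` attribute rather than by its real ID.

```python
import xml.etree.ElementTree as ET

# Illustrative fragments; the property ID below is a placeholder.
PRF_OK = """<properties>
  <simple id="DCE:placeholder" name="processor_name" type="string">
    <value>armv7l</value>
  </simple>
</properties>"""

PRF_MISSING = """<properties>
  <simple id="DCE:placeholder" name="processor_name" type="string"/>
</properties>"""

SAMPLE_SPD = """<softpkg id="DCE:example" name="rh.Example">
  <implementation id="cpp"><processor name="armv7l"/></implementation>
</softpkg>"""


def gpp_processor(prf_text):
    """Return the processor_name value from a PRF, or None when it is missing."""
    for simple in ET.fromstring(prf_text).iter("simple"):
        if simple.get("name") == "processor_name":
            value = simple.findtext("value")
            return value.strip() if value and value.strip() else None
    return None


def spd_processors(spd_text):
    """Map each implementation id to the processor names it declares."""
    root = ET.fromstring(spd_text)
    return {
        impl.get("id"): [proc.get("name") for proc in impl.findall("processor")]
        for impl in root.findall("implementation")
    }


def matching_implementations(prf_text, spd_text):
    """List implementation ids whose processors include the GPP's value."""
    gpp = gpp_processor(prf_text)
    if gpp is None:
        return []
    return [impl_id for impl_id, procs in spd_processors(spd_text).items() if gpp in procs]


print(matching_implementations(PRF_OK, SAMPLE_SPD))       # ['cpp']
print(matching_implementations(PRF_MISSING, SAMPLE_SPD))  # []
```

An empty result for a PRF with no value is the kind of mismatch that would leave no executable device able to host the component.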

NayanaAnand commented 3 years ago

Hi Thomas,

Can you check the attached reference log and guide me on where I am going wrong?

Regards, Nayana

btgoodwin commented 3 years ago

Unfortunately I don't see anything helpful off-hand in the log, no place where it mentions failing to launch rh.SigGen. I see a device manager starting however, so I'm not sure I understand the last bullet. One I don't recognize is line 2113 about being unable to load the libossielogcfg.so, but I doubt it's related. The USAGE STATE messages are a little suspicious though. Have you tried checking the GPP's usage state from the python sandbox or IDE while the system is apparently at idle?

At the start of the year, I had an aarch64 system built on Thud that refused to get out of BUSY. The problem I ultimately could trace down to optimization being turned on (-O3) and a boost regex (' \t') that was failing to parse the output of /proc/sys (if I recall correctly). Because of this it stayed 'busy' all the time. In your log, I'm seeing cpu threshold=10 measured <much bigger than 10> almost the entire time.

Another fix I've implemented in more recent branches is to make the GPP package RDEPENDS on the procps package so that I can eliminate this patch. (This is more of an FYI, I doubt it's related to this, but it might be since this is related to how the GPP tracks tasks, which might be the reason for the child died... messages lacking a PID.)

NayanaAnand commented 3 years ago

Hi Thomas,

Can you tell me in which scenarios the above error can occur?

Regards, Nayana

btgoodwin commented 3 years ago

That's what I'm trying to help identify -- why it failed to satisfy device dependencies. That is a general error stating that there were no Executable Devices in the Domain able to take the allocations necessary or follow through with the load and execute requests for one of a particular component's available implementations. One of the possibilities was a processor_name mismatch, which you've now addressed. Another possibility we are trying to look at is that even though an Executable Device is running with a matching processor_name, it might not be in a state where it can accept any more load or execute requests (usageState BUSY will cause this). What I'm suggesting is to run the device manager and domain in whatever way keeps them stable enough for you to check the usage state of the GPP once it's up.

I apologize too if my usage of "python sandbox" was confusing. I take for granted how common it is for us to say that to generically mean the python shell (ossie.utils redhawk or sb, not just sb).
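As a sketch of the usage-state check from the python shell: the helper below walks a domain object and collects each device's label and usage state. It assumes the attached domain exposes `devMgrs`, that each device manager exposes `devs`, and that devices answer the CORBA attribute accessors `_get_label()` and `_get_usageState()`; treat those names as assumptions to verify against your REDHAWK version. The `Fake*` classes and the `GPP_1` label are stand-ins so the sketch runs without a REDHAWK install.

```python
def usage_states(dom):
    """Map each device label to its usage state rendered as a string."""
    states = {}
    for dev_mgr in dom.devMgrs:
        for dev in dev_mgr.devs:
            states[dev._get_label()] = str(dev._get_usageState())
    return states


# Stand-ins so the sketch runs without a REDHAWK install.
class FakeDevice(object):
    def __init__(self, label, state):
        self._label, self._state = label, state

    def _get_label(self):
        return self._label

    def _get_usageState(self):
        return self._state


class FakeDevMgr(object):
    def __init__(self, devs):
        self.devs = devs


class FakeDomain(object):
    def __init__(self, dev_mgrs):
        self.devMgrs = dev_mgrs


demo = FakeDomain([FakeDevMgr([FakeDevice("GPP_1", "BUSY")])])
print(usage_states(demo))  # {'GPP_1': 'BUSY'}
```

On a live system the same function would be called with the object returned by `redhawk.attach("REDHAWK_DEV")` in place of `demo`; a GPP that reports BUSY while the system is idle would explain the failed launch.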

NayanaAnand commented 3 years ago

Hi Thomas,

Regards, Nayana

NayanaAnand commented 3 years ago

Hi Thomas,

Is there any update on the above issue?

Thanks, Lakshminaidu

btgoodwin commented 3 years ago

The Active state should happen when the device is allocated but not out of capacity (Busy).

As for re-running this build and debugging it, no, I apologize I don't have the spare time to do that right now.

Have you tried doing any of this using a qemu MACHINE target, to see if you can run it in emulation on a similar CPU architecture? I've sometimes found that to be a helpful way to go since I don't have to spend time updating SD cards, etc.

NayanaAnand commented 3 years ago

Ok Thomas. Thanks for the update and your valuable time.

NayanaAnand commented 3 years ago

Hi Thomas,

Thank you for the support.

The issue "Failed to satisfy device dependency for component" is resolved. Solution: as you suggested, I set the log level to TRACE and saw that the processor match failed because "processor_value" was missing in the GPP.prf.xml file.

I am able to launch waveform by creating Application with same python script posted earlier.

Now I want to interact between the desktop server and the target hardware via the Device Manager. Is there a simple way to do that?

If possible, can you suggest ways to send and receive data between the server and the target hardware?

Thanks and Regards, Nayana

btgoodwin commented 3 years ago

I need more information, I think. Based on our discussions, I think you have the IDE running on x86_64 (i.e., the server) and have managed to connect it to the target platform which is already running a Device Manager, streaming data into a waveform that is running on the target, is that correct? If so, streaming data from the target waveform to the server should be as simple as right-clicking on the port and choosing connect, then selecting the destination. To do this from the python shell, you would need to use the connect API on the component (or waveform, if the port is marked as external).

NayanaAnand commented 3 years ago

Hi Thomas,

Test setup: the target board (Phytec hardware) is connected to the desktop server via serial cable, and both the hardware and the desktop are connected to the same VLAN.

Target board is loaded with YOCTO image (contains Basic components, GPP device , Naming services, Domain, Device Manager and Waveforms)

Please find below the procedure that was followed to load the waveform on the target board:

  1. On the target board (Phytec hardware), started the naming service with the Domain and Device Manager using nodeBooter. Command: nodeBooter -D -d /var/redhawk-sdr/sdr/dev/nodes/DevMgr-GPP/DeviceManager.dcd.xml

  2. On the desktop server, started a Python session, attached to 'REDHAWK_DEV', and loaded the waveform as shown below.

        from ossie.utils import redhawk, sb
        dom = redhawk.attach('REDHAWK_DEV')
        app = dom.createApplication('/waveforms/rh/socket_loopback_demo/socket_loopback_demo.sad.xml')

Note: currently I don't have the IDE in the Yocto build, so I can't launch the plot on the target hardware.

Regards, Nayana

btgoodwin commented 3 years ago

We don't have a package for putting the IDE onto the target, for a variety of reasons. Most typically, the target hardware we deal with does not have video output capabilities, or if it does, running the IDE locally on the target isn't essential, since pointing the x86_64 IDE at the target's naming service IP is generally enough to monitor the target.

On the Desktop server, is the /etc/omniORB.cfg configured to point at the target's IP address (i.e., to use the target's naming service instance)? If not, you can pass it as an argument to attach:

# at the x86_64 python shell
dom = redhawk.attach('REDHAWK_DEV', location='A.B.C.D')

This will ensure the resulting domain you're talking to is the target's, not the identically named default domain on the server where the IDE is running.

Similarly in the IDE, when you go to connect to the Domain, it will have a place to enter a new address rather than the default localhost. You would put that A.B.C.D target IP address there.
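Before attaching, it can help to confirm that the target's naming service is reachable from the server at all. The sketch below is a plain TCP connectivity check, assuming omniNames is listening on its default port 2809; `naming_service_reachable` is a hypothetical helper, and the demo uses a local listener on an ephemeral port in place of a real target.

```python
import socket


def naming_service_reachable(host, port=2809, timeout=2.0):
    """Return True when a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True


# Demo against a local listener standing in for the target's naming service.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(naming_service_reachable("127.0.0.1", demo_port))
listener.close()
```

On the server this would be called with the target's A.B.C.D address; a False result points at a network or firewall problem rather than at the REDHAWK configuration.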

NayanaAnand commented 3 years ago

Hi Thomas,

Thank you for the support. I am now able to launch the waveform on the target hardware from the server.

Regards, Nayana

btgoodwin commented 3 years ago

Excellent! You're welcome. Thank you for letting me know.