afrl-rq / OpenUxAS

Project for multi-UAV cooperative decision making

WatchTask crashes AMASE with null pointer exception. #3

Closed bobkr closed 3 years ago

bobkr commented 4 years ago

[Screenshot: Screen Shot 2020-08-03 at 4 17 27 PM]

lhumphrey commented 4 years ago

I just pulled a clean version of UxAS and was not able to duplicate the error. This was fixed a couple of months ago. Are you working with the most recent version of UxAS?

bobkr commented 4 years ago

Yes, I just pulled and built yesterday. Running with Ubuntu 20.04 under VirtualBox on a Mac.

lhumphrey commented 4 years ago

Are you using the run-examples script? I'm wondering if even though you're building UxAS from the most recent code, the run-examples script is picking up an older version stored somewhere else. What do you get when you run the command:

date -r $(which uxas)

manthonyaiello commented 4 years ago

I also am not able to reproduce in a clean bootstrap build of OpenUxAS. But it's clear from @bobkr's output that that is the run-example script run from bootstrap. So it should be picking up the just-built uxas. The bootstrap run-example uses anod printenv to put the path to uxas at the head of PATH.

Perhaps run-example should print the full path to the uxas executable? That would help with debugging issues like this.
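
As a rough illustration of that suggestion, here is a minimal shell sketch. The function name `report_executable` is hypothetical and is not part of run-example (which may not be a shell script at all). It relies only on `command -v` and on `date -r`, the latter as already used earlier in this thread, and assumes GNU `date` as found on Ubuntu.

```shell
# Hypothetical helper: report which executable a name resolves to on PATH,
# plus its last-modified time, so a stale build is easy to spot.
report_executable() {
    local name
    local resolved
    name="$1"
    # command -v prints the resolved path and fails if nothing is found.
    if ! resolved="$(command -v "$name")"; then
        echo "$name: not found on PATH" >&2
        return 1
    fi
    echo "Using $name at $resolved (last modified: $(date -r "$resolved"))"
}

# Example: report_executable uxas
```

Printing the path and timestamp together would answer both questions raised above (which uxas is picked up, and how old it is) in a single line of output.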

bobkr commented 4 years ago

I ran from a fresh install of Ubuntu on virtualbox, and installed through the bootstrap quick start instructions.

lhumphrey commented 4 years ago

It looks as if the issue is probably with AMASE crashing (java.lang.NullPointerException).

@bobkr, is this error reproducible for you? Is it the only example using run-example that has this behavior? What speed are you running AMASE at? The communication bridge between UxAS and AMASE has some known issues at high speeds, so that is a possibility.

Another possibility is to try to rebuild AMASE from within the bootstrap directory: ./anod build --force amase

bobkr commented 4 years ago

Yes, the error is pretty reproducible. I have had the case where it would run one time after a reboot, then fail from then on. That is at speed 1. I don’t think speed 5 ever worked.

Other examples have also failed, with the same type of error.

Another error I get in AMASE is some kind of synchronization error. Like Swing might be trying to access something that is locked by some other process? I will make a note of what the error is next time.

I will try the re-build and see if there is any change.

bobkr commented 4 years ago

I just rebuilt, and I get the same error at 5x speed.

bobkr commented 4 years ago

FYI: I just pulled again and rebuilt. I still get the null pointer exception, even at speed 1; it just runs longer than at 5x before failing. It must be a timing issue, with the communication bridge being slower in VirtualBox.

I will try to get a native system next week and see if it is reproducible there.

lhumphrey commented 4 years ago

@bobkr That is interesting, and it suggests that the problem is with the communications bridge, not OpenUxAS or AMASE per se. OpenUxAS to/from AMASE is using TCP/IP. I sometimes have these types of issues on a VM versus a native machine, but I work mostly on a VM, and these types of errors are relatively rare for me.

Unfortunately, I can't give a lot of guidance on this. Maybe search for issues related to slow TCP/IP on VirtualBox?

bobkr commented 4 years ago

FYI:

I got a native Ubuntu system and do not have the null pointer issues with it. So far, both examples I was having trouble with are working.

Two notes on the installation:

  1. e3-core is missing from your dependencies and must be installed by hand.
  2. One of your scripts uses ‘python’ instead of ‘python3’. I had to create a symbolic link python => python3 to get the build to work.
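
For anyone hitting the second point, a quick check before building can show whether a bare `python` is available at all. This is only a sketch; the `have_command` helper is hypothetical and not part of either repository. On Ubuntu 20.04, the `python-is-python3` package is an alternative to creating the link by hand.

```shell
# Hypothetical pre-build check: is a given command available on PATH?
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Warn early if scripts that call a bare `python` would fail.
if ! have_command python; then
    echo "warning: 'python' not found on PATH" >&2
    if have_command python3; then
        echo "note: python3 is available at $(command -v python3)" >&2
    fi
fi
```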

manthonyaiello commented 4 years ago

@bobkr for your (1), are you installing through OpenUxAS-bootstrap? We do install e3-core as part of the bootstrap install, so I'm surprised to hear that you had to install it by hand. Can you confirm that you're using the latest revision and open an issue on OpenUxAS-bootstrap if the issue persists?

Likewise for (2), is the script that references "python" in OpenUxAS-bootstrap? Or here on OpenUxAS? Can you confirm that you're at the latest revision and open an issue on whichever of the repositories has the problem? This one, in particular, I think we fixed recently, so I'm surprised to hear it gave you trouble.

Thank you in advance!

bobkr commented 4 years ago

I did a fresh install (OpenUxAS-bootstrap) this morning, so it is the latest.

It may be that the first time through the build I hadn't added the python link. If that was needed to install e3, then that might have been the issue. I installed e3 by hand with pip install, and got farther in the build before I realized I had to add the python link. I was getting a missing reference error for uuid_generate, so I was off looking for packages to install before I found the missing python issue.

Before adding the python link, I would get a one-line message in the output saying python was missing. It is well hidden in the hundreds of lines of output.
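
One way to surface a message like that is to save the build output and search it afterwards. The sketch below assumes the output was captured to a file first; the `find_in_log` helper and the `build.log` file name are hypothetical, and the exact wording of the message is not known, so the search is a case-insensitive keyword match.

```shell
# Hypothetical helper: list lines in a saved build log that mention a keyword,
# case-insensitively, prefixed with their line numbers.
find_in_log() {
    grep -n -i -- "$1" "$2"
}

# Usage, assuming the build output was saved first, for example:
#   ./anod build --force amase 2>&1 | tee build.log
#   find_in_log python build.log
```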

manthonyaiello commented 4 years ago

@bobkr Okay. That's definitely not right. I'll take a look to see if I can identify where the bad "python" is. The automatic install failing because of the bad "python" is responsible for your other errors.

I'll let you know when I have a fix for this.

bobkr commented 4 years ago

FYI: I tried example 06 and got this error on startup:

(vpython) bob@bob-MacBookPro:~/bootstrap$ ./run-example 06_AutomationDiagram
Using run-example in /home/bob/bootstrap/sbx/x86_64-linux/uxas-release/src.

<>RoutePlannerVisibility::Vehicle Id [3] turnRadius_m[135.6] nominalMaxBankAngle (deg) [20] nominalSpeed_mps[22]
<>RoutePlannerVisibility::Vehicle Id [4] turnRadius_m[135.6] nominalMaxBankAngle (deg) [20] nominalSpeed_mps[22]
<>RoutePlannerVisibility::Vehicle Id [5] turnRadius_m[135.6] nominalMaxBankAngle (deg) [20] nominalSpeed_mps[22]
<>RoutePlannerVisibility::Vehicle Id [6] turnRadius_m[135.6] nominalMaxBankAngle (deg) [20] nominalSpeed_mps[22]
Aug 18, 2020 3:12:22 PM avtas.lmcp.LMCPXMLReader setValue
SEVERE: null
java.lang.NumberFormatException: For input string: ""
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
    at java.base/java.lang.Long.parseLong(Long.java:709)
    at java.base/java.lang.Long.parseLong(Long.java:824)
    at avtas.lmcp.LMCPXMLReader.readPrimitive(LMCPXMLReader.java:204)
    at avtas.lmcp.LMCPXMLReader.setValue(LMCPXMLReader.java:145)
    at avtas.lmcp.LMCPXMLReader.readXML(LMCPXMLReader.java:73)
    at avtas.amase.scenario.MessageManager.getNextEvent(MessageManager.java:104)
    at avtas.amase.scenario.ScenarioManager.step(ScenarioManager.java:148)
    at avtas.amase.scenario.ScenarioManager.initScenario(ScenarioManager.java:104)
    at avtas.amase.scenario.ScenarioManager.initializeComplete(ScenarioManager.java:225)
    at avtas.app.Context.initialize(Context.java:363)
    at avtas.app.Application.createApplication(Application.java:120)
    at avtas.app.Application.createApplication(Application.java:82)
    at avtas.app.Application$3.run(Application.java:247)
    at java.desktop/java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:313)
    at java.desktop/java.awt.EventQueue.dispatchEventImpl(EventQueue.java:770)
    at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:721)
    at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:715)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:391)
    at java.base/java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:85)
    at java.desktop/java.awt.EventQueue.dispatchEvent(EventQueue.java:740)
    at java.desktop/java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:203)
    at java.desktop/java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:124)
    at java.desktop/java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:113)
    at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:109)
    at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
    at java.desktop/java.awt.EventDispatchThread.run(EventDispatchThread.java:90)

1597788745961 WARN: - automation request ID[91] was not ready in time and was not sent.

bobkr commented 4 years ago

I think you should add a step before the anod builds in the README to use the virtual python environment. That might solve the python issue.

i.e.

~/bootstrap$ source vpython/bin/activate

I was doing it before running the examples, but not before running the builds. I re-installed everything again, but still get the same error running the 06 example.

manthonyaiello commented 4 years ago

@bobkr

Regarding the error for example 06, could you open a separate issue for that, please? That's unrelated to the build-related issues.

Regarding the vpython activation, that's what this command does:

eval "$( ~/bootstrap/install/install-anod-venv --printenv )"

That needs to be run for each new shell or, ideally, placed in your profile.
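
For the profile option, a minimal sketch of adding the line only once might look like the following. The `add_line_once` helper is hypothetical, and the usage comment assumes the bootstrap checkout lives in `~/bootstrap` and that `~/.profile` is the file your login shell reads.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so repeated runs do not create duplicates.
add_line_once() {
    local file
    local line
    file="$1"
    line="$2"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (assumes the bootstrap checkout is in ~/bootstrap):
#   add_line_once "$HOME/.profile" 'eval "$( ~/bootstrap/install/install-anod-venv --printenv )"'
```

As noted further down in this thread, `--printenv` was found to be sensitive to the directory it is run from, so a profile entry may not behave the same as running the command from `~/bootstrap`.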

I will add the creation of a troubleshooting section for the README to my to-do list.

manthonyaiello commented 4 years ago

@bobkr

I set up a fresh VM to test the install issues you've been seeing.

  1. I was not able to replicate the problem with "python" rather than "python3". The installer ran to completion with no errors.
  2. I see that there is a bug in the --printenv command for the installer; it seems to be sensitive to the directory from which the command is run. That's clearly a problem; I will fix it. I've opened https://github.com/afrl-rq/OpenUxAS-bootstrap/issues/2 to track this. I'll let you know when I have a fix.

lhumphrey commented 3 years ago

A fresh VM seems to solve the null pointer exception, which we were not able to duplicate. https://github.com/afrl-rq/OpenUxAS-bootstrap/issues/2 updates the documentation to make it clear that install-anod-venv --printenv needs to be run from ~/bootstrap.