c172p-team / c172p

A high detailed version of the Cessna 172P aircraft for FlightGear
GNU General Public License v2.0

MP completely broken now(?) #440

Closed debdog closed 9 years ago

debdog commented 9 years ago

Using FG 3.6-rc1 a friend and I had problems flying together over MP with the new c172p. I (callsign D-HUND) am on Devuan Jessie, FG 3.6-rc1 compiled from git. The other guy (D-IO70) is on Win8.1 with FG 3.6-rc1 from nightly builds and fgdata release/3.6.0 branch copied from my PC.

First of all I need to say, with any other plane we've tried, including the old c172p, everything works as usual.

Symptoms: We were both starting FG with the new c172p. We were not able to see each other, and neither of us even appeared on the other's pilot list. Any other aircraft within the 100 nm range was visible. Both of us were connected to mpserver12 and, according to the server, we were both online. This issue is reproducible anytime, so if you need more info, we'll gladly provide it.

My log, at loglevel info, is available at http://beggabaur.rocks/fgfs/fgfs.log-c172p_neu.txt and I think the interesting part starts at line 730, though I am not very good at reading logs. Providing one with loglevel debug is no problem, just shout.

Thanks, Alex

wkitty42 commented 9 years ago

@debdog your log seems to show that you have an old copy of the c172p... are you using one from the official FGDATA repository or one from the repo here on github?

line 749: io:5:/pub/fgfs/src/flightgear.git/src/Main/fg_commands.cxx:1166:loadxml: Cannot find XML property file 'Aircraft/c172p/Models/Liveries/4X-CHV (HD livery).xml'.

if you are using the one from the official FGDATA repository, when was the last time you updated it? whenever i update my FGFS, my script pulls all relevant repos from SF... that includes FGDATA and also FGADDON but FGADDON is not necessary...

gilbertohasnofb commented 9 years ago

The file 4X-CHV (HD livery).xml was renamed to 4X-CHV_HD_livery.xml a long time ago, and both our repository and the plane in the official FGDATA repository use the new name. It seems that the problem is indeed that you are using an older version of the plane. Please update it by using the latest one from our repository, or simply use the default one that comes with FGDATA, and then report back if you are still facing problems.

debdog commented 9 years ago

I am using fgdata from git, branch remotes/origin/release/3.6.0, which seems to be up to date:

    $ git fetch
    $ git rebase origin/release/3.6.0
    Current branch 3.6 is up to date.

Latest commit according to "git log": commit 7fc0e42a8b9bc9394db001f7435907d0285e4a24

gilbertohasnofb commented 9 years ago

That's really strange. You should be up to date, and I also just checked the 3.6.0 branch of fgdata, where the name of that file is correct. See: https://sourceforge.net/p/flightgear/fgdata/ci/release/3.6.0/~/tree/Aircraft/c172p/Models/Liveries/ Also, your log says that a lot of properties we use in this plane are not being recognized (e.g. Recording non-existent property '/orientation/alpha-deg').

Can you please check something for me? Go to the folder c172p/Models/Liveries/ and see whether you have the correct filenames with underscores, such as 4X-CHV_HD_livery.xml, or the old ones with spaces, such as 4X-CHV (HD livery).xml.
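A minimal sketch of that filename check in Python, for anyone who prefers a script over browsing the folder. The helper name `split_livery_names` and the throwaway demo directory are assumptions made for illustration; point the function at your own c172p/Models/Liveries/ folder instead.

```python
import tempfile
from pathlib import Path


def split_livery_names(liveries_dir):
    """Return (old_style, new_style) lists of livery XML filenames.

    Old-style names contain spaces, e.g. '4X-CHV (HD livery).xml';
    new-style names do not, e.g. '4X-CHV_HD_livery.xml'.
    """
    old_style, new_style = [], []
    for path in sorted(Path(liveries_dir).glob("*.xml")):
        (old_style if " " in path.name else new_style).append(path.name)
    return old_style, new_style


# Demo on a temporary directory (hypothetical content, not a real install).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "4X-CHV (HD livery).xml").write_text("<PropertyList/>")
    (Path(tmp) / "4X-CHV_HD_livery.xml").write_text("<PropertyList/>")
    old_names, new_names = split_livery_names(tmp)

print("old-style (spaces):", old_names)
print("new-style:", new_names)
```

If the first list is non-empty for the directory FG actually loads, an older copy of the plane is probably in use.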

gilbertohasnofb commented 9 years ago

Also: do you have another (older) version of the c172p installed somewhere else on your computer that may be conflicting with the fgdata one?

wkitty42 commented 9 years ago

what @gilbertohasnofb said! plus, how are you starting your FGFS? with the new --launcher or via some other method?

if you are using the new --launcher, if you hover over the craft's name in the craft list, it will tell you which directory is being used in a tool tip popup... this can be helpful to find which one you may have that is conflicting if that is the problem...

debdog commented 9 years ago

    $ ls /pub/fgfs/fgdata/Aircraft/c172p/Models/Liveries/4X*xml
    /pub/fgfs/fgdata/Aircraft/c172p/Models/Liveries/4X-CHV_HD_livery.xml

I have two additional fg-aircraft paths specified. Neither of them includes a c172p directory. I had the one in fgdata/Aircraft replaced by the old one for testing after we stumbled on that issue. Also:

    $ cat ~/.fgfs/autosave_3_6.xml | grep c172p
    c172p-set.xml
          <path type="string">/pub/fgfs/fgdata/Aircraft/c172p</path>

And IIRC, according to the log I posted, FG used the proper path.

I am starting FG via a bash script[1]; the Win8.1 guy uses the new built-in launcher. Since we're seemingly the only ones with this issue (or aren't we? I have not yet found another 3.6 tester to confirm or deny that) and he's using fgdata from my local branch, it could be a local issue. Sadly I have a slow internet connection and am not able to quickly download the fgdata-3.6-rc1 package for testing.

[1] http://beggabaur.rocks/fgfs/fg36.sh

gilbertohasnofb commented 9 years ago

Well, you seem to have the correct version of the plane given that your livery file is named correctly and that you have pulled the latest commit from FGDATA branch 3.6.0. Could it be that your friend's plane version is different and that your system is somehow thinking he has the newest c172p and looking for the livery file with the old name?

I honestly can't think of anything else except a local problem, as I haven't heard of this problem before (@onox and I made some MP tests with the plane just a couple of days before the FGDATA merge and everything was fine for us).

But maybe @onox has some other advice for you to troubleshoot this.

onox commented 9 years ago

If the other person is not visible in the pilot list (not orange, not white, not yellow) then it's not an issue with the aircraft, but with the server. Try one of the other mpservers.

gilbertohasnofb commented 9 years ago

@debdog any news on this issue? Did you manage to test it again?

geoffmcl commented 9 years ago

@debdog Not sure if this has been solved, but I had no problems with the git 3.7 c172p between Ubuntu 14.04 (64-bit) and Win 7 (32-bit), using mpserver01. If it is still open, I would try other mpservers, as @onox suggested...

Have not seen any code changes in this mp area between 3.6 release and 3.7 current next... so should be the same...

gilbertohasnofb commented 9 years ago

Since this issue has been without further response from the OP for some days and since nobody else is seeing the problem, I will close this issue now. If the OP or anyone else wants me to reopen it, please let me know and also please add more information about the possible bug.

debdog commented 9 years ago

OK, I've been monitoring this situation during the last few weeks and it is hard to reproduce. Testing was also a bit tedious since I didn't find people willing to do extensive testing. On mpserver12 the problem shows up immediately. The server maintainer has no clue why that would be or what could be different from the other servers. On mpserver03 the c172p disappeared after about 20 minutes of flying. A quick test on mpserver04 revealed no issues within about 10 minutes. But I met some guy online (I cannot remember the callsign, though) who had more experience with the new c172p, and he confirmed that sometimes he just seems to vanish from the other pilot's sight.

Given I have some spare time, I am willing to do some more testing. I have a very slow internet connection, so I am not able to properly run two instances of FG connected to different servers. If you have any idea in which direction I could investigate, please give me a hint. Thanks!

EDIT: forgot to mention, I am running FG-3.6-RC atm.

wlbragg commented 9 years ago

Something tells me this is an MP infrastructure issue and not related to the c172p.

debdog commented 9 years ago

Since I have no similar issue with any other craft so far, something tells me this is related to the c172p.

Juanvvc commented 9 years ago

It might be related to the MP infrastructure, but my bet is on different pilots flying different versions of the c172p.

The new c172p shares a lot more data with MP servers than the old c172p. Unfortunately, there is no way to identify whether another pilot in MP is running the new c172p or the old one. This probably means that if different pilots are flying different versions of the aircraft in the same vicinity, they are not going to understand each other. Any weird behaviour can be expected: error messages, weird visuals in any of the aircraft or even complete crashes.

Of course, I'm just giving my opinion without testing if it is true :)

onox commented 9 years ago

> On mpserver03 the c172p disappeared after about 20 minutes of flying.

Do you define "disappeared" as:

  1. Pilot is NOT visible in pilot list and NOT visible visually
  2. Pilot is visible in pilot list, but NOT visible visually

?

> Unfortunately, there is no way to identify if other pilot in MP is running the new c172p or the old one.

If the propeller of the other plane does not spin (while in the air), then it is an old version.

debdog commented 9 years ago

Oy Onox, sadly at this time we were just flying without any intent of bug-hunting and I don't know whether I also vanished from the other guy's pilot list. Just to say, I was using the new c172p and he accidentally picked an older version. He was visible to me all the time. At some airport we met another pilot also using the new c172p, who was visible to both of us, even though my friend was not able to see mine. But I'll try to get him to do some more flying in the upcoming days. He's working odd shifts, though, so it might not be as easy to get him online.

debdog commented 9 years ago

OK, some more info: first, as stated above, if the c172p is not visible, it does not show up on the pilot list either. However, and I cannot stress this enough: again, it is visible on mpmap02 and reported as online by the servers.

For you guys to reproduce the problem: start two instances, both on mpserver01. Sometimes everything is OK, but mostly the other c172p is not visible. If it was not visible, we switched between servers 01, 03, 04 and 12 until we had a visual. Sometimes being on the same server solved it, sometimes we needed to be on different servers. We both were using the basic c172p from 3.6, but I had an additional livery[1] installed and was using it, if that matters.

Later I eventually found a knowledgeable guy on IRC willing to dig deeper into the issue. He was using the Ufo and watching the log as I (D-HUND) showed up with the c172p:

    FGMultiplayMgr::MP_ProcessData - message from D-HUND has invalid length!

I had to comment out a lot of stuff in the c172p's files until we made that message disappear. Eventually we reached a point where I had all the livery XMLs removed except 4X-CHV_HD_livery.xml. And inside there I had all occurrences of 4X-CHV (HD livery) altered to 4X-CHV(HD livery). Once said message had disappeared, I was visible to the Ufo pilot. We did some testing with the name but could not figure out why the first whitespace was fatal but the second wasn't.

[1] http://liveries.flightgear.org/liveries.php?id=970

onox commented 9 years ago

> FGMultiplayMgr::MP_ProcessData - message from D-HUND has invalid length!

This message is found in flightgear in src/MultiPlayer/multiplaymgr.cxx. Somehow the server is sending packets to the other guy with a different size than what the packet header claims. The code in flightgear therefore drops the packet, which means you are no longer visible visually and no longer in the pilot list.

What I find very strange is that this happens only sometimes rather than all the time. It could be that the packet is simply too large. What kind of OS do you and the other guy have? (Linux, Windows, etc.)

Can you remove that additional livery you installed? If you compare it with the default liveries, you see it's quite incomplete.

When you managed to make the message disappear, did you restore (uncomment) "a lot of stuff"?

(It would be interesting to know if fgms (the server) is doing this static_cast<ssize_t>(MsgHdr->MsgLen) != bytes check as well.)
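The kind of receiver-side check discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-field header (a 4-byte magic plus a 4-byte declared total length, both big-endian), the magic value and the helper names are hypothetical and much simpler than the real FlightGear header.

```python
import struct

# Hypothetical, simplified header: 4-byte magic + 4-byte declared total
# length, both big-endian. The real header has more fields; this only
# illustrates the "declared length vs. bytes received" comparison.
HEADER = struct.Struct(">4sI")
MAGIC = b"FGFS"  # assumption for this demo


def build_packet(payload, declared_len=None):
    """Build a packet; declared_len can be forced to simulate a bad sender."""
    total = HEADER.size + len(payload)
    if declared_len is None:
        declared_len = total
    return HEADER.pack(MAGIC, declared_len) + payload


def accept_packet(data):
    """Return True if the packet would be processed, False if dropped."""
    if len(data) < HEADER.size:
        return False
    magic, declared_len = HEADER.unpack_from(data)
    if magic != MAGIC:
        return False
    # Drop the packet when the header's length disagrees with what arrived.
    return declared_len == len(data)


good = build_packet(b"position data")
bad = build_packet(b"position data", declared_len=999)
print(accept_packet(good), accept_packet(bad))
```

A receiver that drops every packet failing this comparison never registers the sender at all, which would match a pilot missing from both the view and the pilot list.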

onox commented 9 years ago

@geoffmcl I see you have commit access to fgms. Can you help with determining whether it is the server or the sending client (D-HUND in this case) that is sending an invalid packet?

debdog commented 9 years ago

> What kind of OS do you and the other guy have? (Linux, Windows, etc.)

I am on GNU/Linux Devuan/Jessie. The guy from the first test, where we jumped servers, is on Windows 8.1. The Ufo pilot is using GNU/Linux Gentoo.

> Can you remove that additional livery you installed? If you compare it with the default liveries, you see it's quite incomplete.

Ok, I am tempted to say "WAT". In what way is it incomplete? It worked well with the old c172p and since the new one still ships with some of the old liveries (created with the same paint kit), what's your point?

> When you managed to make the message disappear, did you restore (uncomment) "a lot of stuff"?

Of course, including the extra livery. I want my branch to not have (intentionally) messed up files. Thanks for asking!

geoffmcl commented 9 years ago

Yes @onox, the fgms servers do have some packet length checks, but they do not check whether the packet reports a different size than it physically has...

See https://gitlab.com/fgms/fgms-0-x/blob/master/src/server/fg_server.cxx#L1005 for the validation done... there is no if( (int)MsgLen != Bytes ) check!

And note the decode of MsgLen is XDR_decode<uint32_t> ( MsgHdr->MsgLen );, not a simple cast... although that too would work if the network byte order matched the endianness of the machine...
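To illustrate the byte-order point, here is a small Python sketch. XDR carries 32-bit integers in big-endian (network) order; the value 232 is a made-up length for the demo, and the variable names are hypothetical.

```python
import struct
import sys

msg_len = 232  # hypothetical declared length, for illustration only

# On the wire: 4 bytes, most significant byte first (network byte order).
wire = struct.pack(">I", msg_len)

# Explicit big-endian decode: correct on every machine.
decoded = struct.unpack(">I", wire)[0]

# Reading the same bytes as little-endian gives a very different number.
as_little_endian = struct.unpack("<I", wire)[0]

# Reading in the host's native order ("a simple cast") is only correct
# when the host happens to be big-endian.
native = struct.unpack("=I", wire)[0]

print(decoded, as_little_endian, native == decoded, sys.byteorder)
```

On the little-endian x86 machines most pilots run, the native read does not match the decoded value, which is why an explicit decode is needed before any length comparison.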

So as far as I can see, each and every fgms server would forward all packets it receives from connected LOCAL fgfs instances that pass this simple validity test on to other pilots within the configured range, and on to relays/hub/crossfeeds/...

That suggests the fgfs instance has generated this invalid packet! But it has not been rejected by any fgms server...

It seems @debdog's approach of cycling through different servers is a very good, albeit probably very tedious, approach to narrowing this down... you might like to include server 14 in some of the testing...

And it would also be interesting to know whether any pilot who is not seen is in this list: http://crossfeed.freeflightsim.org/flights.json

That list powers this moving map display: http://geoffair.org/fg/map-test2/map-test.html So again, when there is a problem, are both aircraft shown on this map?
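Checking such a list for missing callsigns could be scripted roughly as below. This is a sketch under assumptions: the JSON layout in `SAMPLE` (a top-level "flights" array whose entries carry a "callsign" key) is hypothetical, as are the helper name and the callsign D-TEST, so adjust the keys to whatever the feed actually returns.

```python
import json

# Hypothetical sample data; the real feed's layout may differ.
SAMPLE = """
{"flights": [
  {"callsign": "D-HUND", "model": "c172p"},
  {"callsign": "D-IO70", "model": "c172p"}
]}
"""


def missing_callsigns(feed_text, expected):
    """Return the expected callsigns that do not appear in the feed."""
    feed = json.loads(feed_text)
    seen = {flight.get("callsign") for flight in feed.get("flights", [])}
    return [callsign for callsign in expected if callsign not in seen]


missing = missing_callsigns(SAMPLE, ["D-HUND", "D-IO70", "D-TEST"])
print(missing)
```

A pilot who is invisible in the sim but present in the feed points at the receiving client dropping packets rather than at the servers.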

I have been monitoring this thread, and will help in any way I can...

debdog commented 9 years ago

Thank you very much @geoffmcl for your reply!

> It seems debdog approach of cycling to different servers is a very good, albeit probably very tedious, approach to narrowing this down...

Hehe, true. And I've only mentioned it so the c172p devs are able to reproduce the problem. I am feeling a little alone since so far no one here has confirmed that behaviour. And instead of looking into the issue, everybody seems to just blame it on something else.

> you might like to include server 14 in some of the testing...

Is there a special reason to do so? Just asking. I think we hopped on 14 once and our c172p's weren't visible to us.

> And it would be interesting to also know if any pilot not seen is in this list http://crossfeed.freeflightsim.org/flights.json?

Nice! I thought of telnet-ing to the servers to get a pilot list, but the last time I tried that (quite a long time ago) it wasn't very reliable. I'll use your links the next time we're flying, thanks!

geoffmcl commented 9 years ago

@debdog yeah for sure tell me when next flying... I can put up a few birds...

But it would be better if you hit me with a direct email with a different subject... it was 7 hours today before I got through other stuff and read this... all c172-detailed emails are auto-diverted to their own folder, so are not in main inbox...

debdog commented 8 years ago

Oops-a-daisy, completely forgot about this issue. Sorry, I haven't had much time for FG in the past two months. But since FG 3.6 has been postponed, it's all not that urgent anymore. Will try to get up to date with FG during the upcoming week and then continue with test flying.

gilbertohasnofb commented 8 years ago

For those who don't follow the forum, there have been some reported problems with MP and crashes; see http://forum.flightgear.org/viewtopic.php?f=4&t=25157&start=2535#p271381