donatengit / AppleIGB

Internet connection lost after a few minutes #2

Open Leborgne23 opened 2 years ago

Leborgne23 commented 2 years ago

Hi, just to thank you for your work and let you know what works and what does not in my case:

donatengit commented 2 years ago

Hi @Leborgne23 Thanks for checking this out

Which link speed works and which doesn't in your case? Is the link stable, e.g. no packet loss?

donatengit commented 2 years ago

One more thing: could you please send what is detected in (Mac) -> (About) -> (Ethernet)? I'm interested in the Device/Vendor IDs.

https://support.apple.com/en-gb/guide/system-information/syspr35536/mac

Leborgne23 commented 2 years ago

Hi. Here is the info given by System Report for the Ethernet NIC. Wake on LAN and boot from LAN are disabled in the BIOS.

  Bus: PCI
  Vendor ID: 0x8086
  Device ID: 0x1539
  Subsystem Vendor ID: 0x1458
  Subsystem ID: 0xe000
  Revision ID: 0x0003
  PCIe Link Speed: 2.5 GT/s
  PCIe Link Width: x1
  Driver: com.amdosx.driver.AppleIGB
  BSD Device Name: en2
  MAC Address: 18:c0:4d:99:61:23
  AVB Support: No
  Maximum Link Speed: 1 Gb/s

Forcing 1000BaseT breaks the connection (local + internet); forcing 10baseT or 100baseTX is OK.

Thanks again.

Fabrice

Leborgne23 commented 2 years ago

Well, so far so good: more than 4 hours with no interruption whatsoever. The only thing I have done since last time is update macOS to 12.2.1. Regards.

Leborgne23 commented 2 years ago

Nevermind, issue is back again after 5 hours.

donatengit commented 2 years ago

Nevermind, issue is back again after 5 hours.

@Leborgne23 The problem is that I can't fully reproduce the 1 Gb issues with my NIC, as it autonegotiates 100 Mb with my old router and none of the other options work (not necessarily because of the driver). While I'm looking for a valid 1G partner for testing, could you please test one more version: [DELETED]. It has some dirty hacks to avoid resets that happen for no obvious reason, so ifconfig enX up/down now works (in my case), but with the drawback that forcing speed/control dramatically reduces the connection speed (reason unclear atm). Additionally, it has far more logging; for additional debug output, please run sudo dmesg | grep -i igb in the console while manipulating the NIC's state through ifconfig.

donatengit commented 2 years ago

Hi @Leborgne23

Could you please try a new version? It's supposed to be far more stable with link state changes.

Thanks in advance

Leborgne23 commented 2 years ago

Hi! Thanks a lot for the notification. I’ve been using the new version for an hour or so with no problem so far. I’ll keep you posted. Thanks for what you do. Fabrice

NyaomiDEV commented 2 years ago

Hello there. I am also experiencing the issues described here with your latest version (the one you sent out one hour ago). I am trying to install Monterey, and the installation errored out because the network connection cut off almost half an hour in.

I am using I211 on Asrock X570 Taichi.

donatengit commented 2 years ago

I am trying to install Monterey

Hi @NyaomiDEV, thanks for trying this out. The driver is specifically for Monterey (built targeting 12.1). SmallTree is supposed to work well on previous versions (meaning the download/upgrade could be done without this driver).

You can control which driver is used in each OS version by setting the MinKernel/MaxKernel options for every kext loaded in config.plist.
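One thing to keep in mind when filling in MinKernel/MaxKernel: OpenCore matches them against the Darwin kernel version, not the macOS marketing version. A minimal sketch of the mapping for the two releases discussed in this thread follows; the helper name darwin_major is made up for illustration, and the printed values are only example bounds to check against your own config.plist.

```shell
#!/bin/bash
# Hypothetical helper (not part of AppleIGB or OpenCore): map a macOS major
# version to its Darwin kernel major version, which is what MinKernel and
# MaxKernel are compared against.
darwin_major() {
    case "$1" in
        11) echo 20 ;;   # Big Sur
        12) echo 21 ;;   # Monterey
        *)  echo "unknown macOS version: $1" >&2; return 1 ;;
    esac
}

# Example lower bound for a kext that should load on Monterey and newer.
echo "MinKernel for Monterey: $(darwin_major 12).0.0"
# Example upper bound for a kext that should load up to Big Sur only.
echo "MaxKernel for Big Sur:  $(darwin_major 11).99.99"
```

With bounds like these, a Monterey-only driver and a Big Sur-only driver can sit side by side in the same config.plist without both loading at once.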

NyaomiDEV commented 2 years ago

The driver is for Monterey specifically (built targeted 12.1)

I know. I am, indeed, trying to install Monterey from scratch, meaning that I am loading your kext in the macOS recovery. Please tell me if I am doing this wrong, though! I just want this suffering to end in a working installation, so I can keep trying all the kexts you can provide until I get the OS to install.

donatengit commented 2 years ago

I am, indeed, trying to install Monterey from scratch, meaning that I am loading your kext on the macOS recovery.

Oh, I've never tested the driver in this way. And I'm not sure which debugging options are available during the process, tbh. Is WiFi working? If so, just disable the driver for a while. P.S. Feel free to ping me on Discord, donniedisc#1988 (server), if that's more convenient.

NyaomiDEV commented 2 years ago

Is WiFi working?

It's an Intel AX200, which I can use only after installing the system anyway. Can't count on it sadly.

(server)

We'll probably hear from each other in five minutes because of the Discord server cooldown.

Leborgne23 commented 2 years ago

Well I still have the issue using the new version sorry. Seems more stable if I force half duplex, maybe that can help. Thanks guys.

donatengit commented 2 years ago

Well I still have the issue using the new version sorry. Seems more stable if I force half duplex, maybe that can help.

@Leborgne23, thanks.

  1. Is the connection still lost entirely (as if the cable were unplugged or similar), or is it packet loss?
  2. What kind of network activity was there during that period: intense or almost none? It might have something to do with EEE power management.
  3. Will you be able to run a couple of additional commands in the terminal when you notice problems?
  4. What link speed is shown on the router? Is it the same as the autonegotiated or forced one?

Leborgne23 commented 2 years ago

Thanks! Here are my answers:

  1. Is it still connection lost (i.e. cable unplugged or similar) or packets loss? -> The connection is lost, but the OS thinks it’s connected. The web browser tries to establish a connection and gives up after 30 seconds or so.
  2. What kind of network activity was that period: intense or almost none? It might be something with EEE power management -> I tested it using P2P (torrent) downloading/uploading, so I guess yes, it was intense.
  3. Will you be able to run additional couple of commands in terminal when noticing problems? -> Yes, I’ll do it. Which ones?
  4. What link speed status is shown on the router, is this the same as autonegotiated/you force? -> Yes.

donatengit commented 2 years ago

I tested it using p2p (torrent) downloading / uploading so I guess yes it was intense.

OK, that narrows down the root cause, I guess. Either the driver can't cope with the load (torrenting is indeed one of the most network-intensive activities) or it can't detect/manage hangs properly (since the counter-party is often unreliable).

  3. Will you be able to run additional couple of commands in terminal when noticing problems? Yes I’ll do it, which ones ?

Great, I'll prepare a debug version with additional logging around packet transmission and would ask you to run sudo dmesg | grep -i igb in the terminal right after the problem occurs. But first I'll try to reproduce the issue myself with some torrents.
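Since the log has to be grabbed right after a drop, it may be easier to wrap the command in a small helper that saves a timestamped copy. This is only a sketch: the function names and the log file name pattern are assumptions, and the filter is the same grep -i igb as above.

```shell
#!/bin/bash
# Sketch only: function names and the output file name are assumptions.

# Keep only kernel log lines that mention the driver (case-insensitive),
# exactly like "grep -i igb" in the thread.
filter_igb() {
    grep -i igb
}

# Save a timestamped snapshot of the filtered kernel log.
# sudo is used here as in the thread's own command.
snapshot_igb_log() {
    local out="igb-$(date +%Y%m%d-%H%M%S).log"
    sudo dmesg | filter_igb > "$out"
    echo "saved $out"
}

# Call snapshot_igb_log right after the connection drops.
```

Each call writes a separate file, so several drops can be collected and shared together.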

donatengit commented 2 years ago

Hi @Leborgne23,

I've tested the driver under heavy torrent load, and indeed some packets were timing out (fewer with the patches below), but the overall download speed was constantly hitting the maximum my ISP allows. And, unfortunately for reproducing the issue, the link remained stable.

Anyway I applied several changes that might help:

  1. Packets are now explicitly rejected when the transmit queue is busy (before that it was kind of silent)
  2. Increased the default queue capacity from 256 to 1024
  3. Added options to enable/disable EEE mode (there are notes that disabling it could fix spontaneous link problems)
  4. Ensured the software interrupt register is written in the watchdog so that the rx ring gets cleaned

Could you please test AppleIGB.kext.zip ?

I recommend testing autonegotiated 1 Gb/s first and, if the issue remains, forcing 1 Gb/s without EEE. Separately, it would make sense to test limiting the download/upload speed of your torrent client to 80-90% of your maximum ISP speed, leaving room for other web/network activity (according to my tests, torrents can take all of it).

As for additional debug, please run 2 terminals:

  • one with ping 8.8.8.8 -- it constantly pings Google and shows the response time (it could show timeouts or an increase in ms if torrents take all the bandwidth)
  • another with sudo dmesg | grep -i igb -- run this as soon as you see any problem and accumulate the output for sharing later

Thanks in advance

Leborgne23 commented 2 years ago

Hi, I tested the attached version with no luck, unfortunately. Here is the screenshot of the requested commands. Maybe something is wrong with my OpenCore setup? Thanks again. Fabrice

donatengit commented 2 years ago

@Leborgne23 Thanks, forgot to attach the screenshot?

donatengit commented 2 years ago

Maybe something wrong with my OpenCore setup ?

Did SmallTree work well before?

Leborgne23 commented 2 years ago

I did attach the screenshot to the email, not in GitHub. Here it is again just in case. Thanks

Leborgne23 commented 2 years ago

I have no idea, as I only got this motherboard last month and the only OS I've used on it is Monterey, so I got the issue everyone has with SmallTree. If that can help, I can install Big Sur on an external disk, use SmallTree and report back. Fabrice

donatengit commented 2 years ago

@Leborgne23

I did attach the screenshot to the email, not in GitHub.

Thanks but still don't see it for some reason.

If that can help I can install Big Sur on an external disk, use SmallTree and report back.

It's a good idea. Please follow the Dortania guide carefully and, while testing, ensure no other network interfaces are enabled (including Wi-Fi).

But before that, there is another version available, which stabilizes output speed by stalling packets (as in the IntelMausi driver).

llyonard commented 2 years ago

I'm having the most stable connection atm with the 5.7.2-im, with the newest 2 i keep having random disconnection every 2/3 min.

NyaomiDEV commented 2 years ago

I'm having the most stable connection atm with the 5.7.2-im, with the newest 2 i keep having random disconnection every 2/3 min.

On which hardware though?

llyonard commented 2 years ago

Intel I211 controller. It's stable except under heavy upload load (download seems OK). The other 2 versions are really unstable in my config.

NyaomiDEV commented 2 years ago

intel i211 controller

On which chipset?

llyonard commented 2 years ago

AMD X570 aorus elite.

Edit: I'm having the same problems with that release too; I was just lucky on some boots (I still don't know why).

thedxrklord commented 2 years ago

Same issue here: x570f gaming, i211 controller. It seems to stop when I try to create a new connection (for example, joining a Discord voice channel).

donatengit commented 2 years ago

I'm having the most stable connection atm with the 5.7.2-im, with the newest 2 i keep having random disconnection every 2/3 min.

Hey, thanks for testing.

Great to hear that X570 is more or less confirmed working. Would you be able to use the DEBUG build of the latest driver and run sudo dmesg | grep -i igb right after a disconnection? In general, please give it 5-10 seconds to establish the connection.

Tbh I'm a bit surprised that 5.7.2-im is more stable than the later versions, since it doesn't explicitly reject packets when the output queue is out of capacity. What kind of workload do you have?

donatengit commented 2 years ago

The other 2 versions are really unstable

And what is meant by unstable in this context?

donatengit commented 2 years ago

Same issue x570f gaming i211 controller Seems like it stops when I'm trying to create a new connection (for example, joining to discord voice channel)

Which version do you use?

donatengit commented 2 years ago

Also, which SMBIOS/Mac type do you declare in your config.plist?

donatengit commented 2 years ago

Based on some debugging with @thedxrklord on Discord, please try changing the connection mode from Auto to 100 or 1000 Mbps, full-duplex, with or without flow control. The setting is under macOS Settings -> Network -> Ethernet adapter (the i211 one) -> Advanced or Additional (sorry, I don't have the English UI; it's where the TCP/IP and DNS tabs are) -> Hardware tab.

It helped the guy. Let me know whether the connection is stable in your case. It might help me to narrow the issue

llyonard commented 2 years ago

OK, I'll try this new setting in the hardware config. As for the other questions, you're right: I really don't see any difference with the newest versions, I just had some luck with the old one. I'll report later if I still have an unstable connection even with the changes in Ethernet --> Hardware.

donatengit commented 2 years ago

Thanks, it would be ideal if you ran sudo dmesg | grep -i igb in the terminal right after the issue occurs. Feel free to PM me on Discord tomorrow for a quicker turnaround if that's more convenient.

thedxrklord commented 2 years ago

Hi everyone. Yesterday I tested different settings in the advanced network options (thanks @donatengit). 100baseTX is more stable, but I get a slow internet connection (10mbit/250mbit). BUT if 100baseTX drops, I switch back to 1000 (ff,fc) and it is OK. No idea why this happens; the debug output looks the same in both cases.

Just so you know, auto mode drops the connection every 2-3 minutes; 100 is the most stable one.

I'm still investigating why it happens, and I'll write here if I find something.

Also, here is a bash script. You can add it to autoload; it will restart your interface automatically when the connection drops.

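#!/bin/bash
# Restart the interface whenever none of three quick pings to 8.8.8.8 is answered.
# Replace en0 with your interface's BSD device name if it differs.

while true; do
    # macOS ping flags: -c 3 probes, -t 1 second overall timeout, -i 0.1 s between probes
    if ! ping -c 3 -t 1 -i 0.1 8.8.8.8 > /dev/null; then
        echo "network down"
        sudo ifconfig en0 down
        sudo ifconfig en0 up
        sleep 10   # give the link time to come back up
    fi
done

llyonard commented 2 years ago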

Thanks for the bash script. I can confirm that without auto it is much more stable.

NyaomiDEV commented 2 years ago

The bash script can certainly give a sense of continuity for normal browsing, but the issue at hand is disruptive in real-time applications like Teams meetings. By the way, is it not stable at 1000 Mbit/s at all?

llyonard commented 2 years ago

2 days ago it was really stable even at 1000 Mbit/s; today it disconnects every 3 minutes again. Here is the dmesg: https://pastebin.com/DvRLJT3G

donatengit commented 2 years ago

Also, here is a bash script You can add it to autoload It will reboot your interface automatically, when it drops

Hi @thedxrklord ,

Thanks again for testing. I've just tested your approach on a real MacBook Pro (2017), and I'm not sure it consistently reflects the state of the connection. Here is a slightly modified script that prints the exit code:

% cat check_link.sh 
#!/bin/bash

while true;do
ping -c 3 -t 1 -i 0.1 8.8.8.8 > /dev/null
PING_EXIT=$?
if [ $PING_EXIT -ne 0 ]; then
    echo "network down ? (code=$PING_EXIT)"
    sleep 10
fi
done

It outputs the following (under high network load, within ~15 minutes):

tmp % ./check_link.sh 
network down ? (code=2)
network down ? (code=2)
network down ? (code=2)

So it seems that in some circumstances the script will perform a lengthy ifconfig down/up on a perfectly valid connection.

It would probably make sense to point your script at a local address first (e.g. the router or ISP switch) to rule out ISP issues. Even so, please note that macOS is generally not great at distributing network capacity between apps and services; one of them can take a large share of it. I ran ping -i 0.1 192.168.1.1 (local) under high network load (two 4K YouTube videos plus a speed test), and pings started to drop while the response time increased dramatically. Here is how it looks on a real Mac whose normal ping is 1-2 ms:

64 bytes from 192.168.1.1: icmp_seq=24443 ttl=64 time=158.267 ms
Request timeout for icmp_seq 24445
Request timeout for icmp_seq 24446
64 bytes from 192.168.1.1: icmp_seq=24445 ttl=64 time=305.311 ms
64 bytes from 192.168.1.1: icmp_seq=24447 ttl=64 time=169.409 ms
64 bytes from 192.168.1.1: icmp_seq=24448 ttl=64 time=91.876 ms
Request timeout for icmp_seq 24453
Request timeout for icmp_seq 24454
64 bytes from 192.168.1.1: icmp_seq=24452 ttl=64 time=357.367 ms
Request timeout for icmp_seq 24456

ifconfig down/up is a comparatively expensive operation that can take up to 10 seconds. What kind of load/context do you have when you first experience the issues: CPU, network load? What happens if you don't run your script?
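One way to make such a watchdog less trigger-happy is to probe a local address and only react after several failed probes in a row. The following is a rough sketch, not a tested fix: TARGET, THRESHOLD and all function names are assumptions, and the ping flags follow the macOS usage in this thread.

```shell
#!/bin/bash
# Sketch only: TARGET, THRESHOLD and the function names are assumptions.
# Ping flags follow macOS usage, where -t is a timeout in seconds.

TARGET="${TARGET:-192.168.1.1}"   # local router, to rule out ISP problems
THRESHOLD="${THRESHOLD:-5}"       # consecutive failures before reacting

# One probe: succeeds if a single echo reply arrives within 1 second.
probe() {
    ping -c 1 -t 1 "$1" > /dev/null 2>&1
}

# Print the new failure streak given the old streak and the probe's exit status.
next_streak() {
    if [ "$2" -eq 0 ]; then echo 0; else echo $(( $1 + 1 )); fi
}

# Succeed when the failure streak has reached the threshold.
link_looks_down() {
    [ "$1" -ge "$2" ]
}

watch_link() {
    local streak=0 status
    while true; do
        probe "$TARGET"; status=$?
        streak=$(next_streak "$streak" "$status")
        if link_looks_down "$streak" "$THRESHOLD"; then
            echo "$(date '+%H:%M:%S') no reply from $TARGET for $streak probes in a row"
            streak=0
            # An interface restart (ifconfig down/up) could go here.
        fi
        sleep 1
    done
}

# Start monitoring only when asked to, so the file can also be sourced.
if [ "${1:-}" = "--watch" ]; then
    watch_link
fi
```

A single lost ping under heavy load resets nothing here; only a sustained run of failures is reported, which should avoid bouncing a link that is merely congested.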

donatengit commented 2 years ago

The bash script sure can help getting a sense of continuity out of normal browsing, but the issue at hand is disruptive in real time applications like Teams meetings.. btw, so is it not stable on 1000Mbit/s at all?

Hi @NyaomiDEV,

Thanks, are you testing this on Big Sur? And SmallTree doesn't have these issues in exactly the same context, correct?

donatengit commented 2 years ago

2 days ago was really stable even with 1000mbit/s, today it disconnects every 3 mins again. Here the dmesg https://pastebin.com/DvRLJT3G

Hi @llyonard,

Thanks for testing. It's really unusual that the driver started to disconnect after 2 days of stable work. I guess something has changed, and finding out what might help us narrow down the issue.

  1. First of all, how do you determine that the connection has dropped? In the logs provided I see manual on/off switches, but nothing indicating that something changed with the link. You could run ping -i 0.1 8.8.8.8 to see the picture in real time and collect the debug logs right after packets start being dropped continuously.
  2. Did you use the reconnection script from this thread? Please see my reply above with checks from a real Mac.
  3. Do you have other network devices in your setup: another network chipset, Wi-Fi, an iPhone plugged in via USB, ...? Including virtual ones, e.g. a VPN? Does the issue remain if everything else is switched off?
  4. Is there anything network-related in the BIOS that could be a reason?
  5. Did your device go to suspend/sleep during that period? What kind of load did you have while experiencing the issue: high CPU, network and/or disk activity?
  6. Which macOS version do you have at the moment? Did you upgrade from Big Sur? Was the SmallTree driver working there?
  7. Is there anything else not working properly at the moment, e.g. Wi-Fi, Bluetooth?
  8. Which OpenCore version do you have?
  9. Which Mac device do you declare in config.plist? Not sure this is related, but I declare Mac Pro (2019) as the closest to my hardware, including Radeon drivers.

Thanks in advance
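To put a number on "packets are dropped constantly", the ping output can be summarized instead of eyeballed. A minimal sketch, assuming the macOS ping output format shown earlier in this thread; the function name summarize_ping is made up for illustration.

```shell
#!/bin/bash
# Sketch only: the function name is an assumption; it parses the macOS ping
# output format quoted earlier in this thread.

# Read ping output on stdin and print "replies=N timeouts=M".
summarize_ping() {
    awk '
        /bytes from/       { replies++ }
        /^Request timeout/ { timeouts++ }
        END { printf "replies=%d timeouts=%d\n", replies, timeouts }
    '
}

# Usage: ping -c 600 -i 0.1 8.8.8.8 | summarize_ping
```

Comparing the two counters from a run on the affected machine with a run on a known-good machine on the same network gives a rough idea of whether the loss is specific to the driver.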

llyonard commented 2 years ago

1) Yes, there are manual disconnects, but I only did that to restart the connection.
2) No.
3) Yes, I have an Intel AX200, but it is disabled.
4) I can't see any, but I'll check more deeply.
5) No, no suspend; I have it disabled because it isn't stable even in Windows or Linux.
6) Latest Monterey (not a beta or anything like that).
7) Sometimes when the Ethernet is too unstable I activate the Wi-Fi, and it loses connection but I have a more stable experience.
8) Latest release.
9) Same Mac device in config.

donatengit commented 2 years ago

Hi @llyonard Thanks

1)Yes there are manual disconnects but i just did that for restart the connection

So, in your terms, is a disconnect a dramatic speed drop, packet loss or something else?

3) Yes, i have an intel ax200 but it is disabled

It's possible that the issue is caused by this, but it's hard to say without having the NIC to debug.

4) I cant see any but ill check more deeply

Any news?

Cryptiiiic commented 2 years ago

@donatengit I'm also having connection drops. I have an I211 NIC on an ASUS ROG Formula VII X570. I'm on Monterey 12.5. The only fix afaik is to up/down the interface using ifconfig, but that's annoying to do all the time. In fact, I had to re-up the interface 3 times while typing this comment. There were no mentions of igb in the dmesg log, nor any errors afaik. I do see a decent amount of connect() - failed necp_set_socket_domain_attributesconnect() logs, but I'm unsure if that's relevant to IGB.

llyonard commented 2 years ago

Any news?

Sadly, I couldn't solve any of my problems, so I was forced to buy a USB-to-Ethernet adapter, and since then I've had zero problems. Anyway, I'm always open to helping if you need more tests for this project.

Cryptiiiic commented 2 years ago

Found another log:

[15063.048857]: uipc_accept: peer disconnected unp_gencnt 30140
Sandbox apply: mdworker_shared[3649] <bytes>
Sandbox apply: mdworker_shared[3648] <bytes>
Sandbox apply: mdworker_shared[3647] <bytes>
compat_ifmu_ulist: en0 copyin() error 14c

Cryptiiiic commented 2 years ago

Pinging Google -> connection drops after 30 minutes