m-parashar / adbhostgen

Script to generate massive block lists for DD-WRT
GNU General Public License v3.0

Segmentation fault during setup #3

Open Paradox opened 5 years ago

Paradox commented 5 years ago

VERSION: 20180331 on Netgear R6400 running Linux DD-WRT 4.4.159 #4001 SMP Wed Oct 10 09:28:16 CEST 2018 armv7l DD-WRT (build 37305)

Size of /tmp/mphosts.tmp: 9.9M
Size of /tmp/mpdomains.tmp: 3.6M

Processing blacklist/whitelist files
Processing final mphosts/mpdomains files
Segmentation fault
Removing temporary files
Size of /jffs/dnsmasq/mphosts: 0
Size of /jffs/dnsmasq/mpdomains: 3.1M
Number of ad domains blocked: approx 0

Result of free:

             total       used       free     shared    buffers     cached
Mem:        254576      51496     203080          0       6292      22312
-/+ buffers/cache:      22892     231684
Swap:       487420          0     487420

For testing purposes, I used -0 to see whether a smaller file would work, but I hit the same problem.

I've narrowed it down to this line:

LC_ALL=C cat $tmphosts | sed -r 's/^[[:blank:]]//; s/[[:blank:]]$//; /^$/d; /^\s*$/d' | tr -cd '\000-\177' | cat tmpbl - | grep -Fvwf tmpwl | sort -u | awk -v "IP=$ADHOLEIP" '{sub(/\r$/,""); print IP" "$0}' > $mphosts

If I replace $tmphosts with $tmpdomains, it completes successfully, so something is going on with the $tmphosts file. I think I'm going to add some code to split the file into smaller files, see whether it can process them, and then concatenate the results back into one file. Failing that, I may be able to narrow it down to an error in one or more of the hosts in the file.

OK, so I added some clunky code to break the mphosts.tmp file into smaller files, process each one, and append the results to the final mphosts file. First I broke it into files of 100000 lines each, and a couple of them segfaulted. I then dropped it to 50000 lines each, and they all finished without segfaulting. This seems to have worked, although I have not yet verified that everything works as expected.

OK, I cleaned up my code to the best of my abilities and ran it several times, and it seems to be working fine. I haven't tried raising the line count above 50000 to find the actual limit, although from my previous tests the segfaults seemed to start at a little over 2 MB worth of lines. If anyone is interested in my code and would like to help me make sure it's good enough for a pull request, just let me know here.
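For anyone who wants to try the same workaround, here is a minimal sketch of the chunking idea described above. It is not the code from this comment: the function name `process_in_chunks` and its arguments are hypothetical, only the `sort -u` stage is applied per chunk, and it assumes `split` and `mktemp -d` are available on the router's BusyBox build.

```shell
#!/bin/sh
# Hypothetical helper, not part of adbhostgen.
# Usage: process_in_chunks INPUT OUTPUT [LINES_PER_CHUNK]
process_in_chunks() {
    input=$1
    output=$2
    lines=${3:-50000}          # 50000 is the chunk size reported to work above
    workdir=$(mktemp -d) || return 1

    # Split the input into files of at most $lines lines each.
    split -l "$lines" "$input" "$workdir/chunk."

    # Start with an empty output file, then append each processed chunk.
    : > "$output"
    for chunk in "$workdir"/chunk.*; do
        [ -f "$chunk" ] || continue      # glob matched nothing (empty input)
        LC_ALL=C sort -u "$chunk" >> "$output" || { rm -rf "$workdir"; return 1; }
    done

    rm -rf "$workdir"
}
```

Because each chunk is sorted and de-duplicated on its own, a host that appears in two different chunks will still appear twice in the output, and the output as a whole is not sorted.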

Paradox commented 5 years ago

Well, another segmentation fault after further testing... bummer.

So it doesn't seem to be a size issue. I'm thinking there are badly formed hosts in the list, or something.

Paradox commented 5 years ago

I found that the segmentation fault actually comes from this part of the pipeline:

| cat tmpbl - | grep -Fvwf tmpwl | sort -u

So now I'm looking into why it is happening and how to avoid it.

I've narrowed it down to the sort command, but I'm still not 100% sure why it happens, or why it happens on only one segment of the file and no others.
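One way to confirm which stage crashes is to run the stages one at a time through intermediate files and check each exit status. The sketch below is only an illustration: `run_stage` and the `/tmp/stage*.out` file names are hypothetical, and it relies on the common shell convention (bash, dash, BusyBox ash) that a command killed by signal N returns exit status 128 + N, which gives 139 for SIGSEGV.

```shell
#!/bin/sh
# Hypothetical helper, not part of adbhostgen.
# Usage: run_stage NAME COMMAND [ARGS...]
# Runs one pipeline stage (stdin to stdout) and reports on stderr
# whether it failed or was killed by a signal.
run_stage() {
    stage_name=$1
    shift
    "$@"
    stage_status=$?
    if [ "$stage_status" -gt 128 ]; then
        # 128 + N means "killed by signal N"; SIGSEGV is signal 11.
        echo "$stage_name: killed by signal $((stage_status - 128))" >&2
    elif [ "$stage_status" -ne 0 ]; then
        echo "$stage_name: exit status $stage_status" >&2
    fi
    return "$stage_status"
}

# Example (file names are placeholders):
#   run_stage grep grep -Fvwf tmpwl < /tmp/stage1.out > /tmp/stage2.out
#   run_stage sort sort -u          < /tmp/stage2.out > /tmp/stage3.out
```

Keeping the intermediate files also makes it possible to re-run only the failing stage on the exact input that triggered the crash. Note that grep returns exit status 1 when it selects no lines, which is not a crash.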

Paradox commented 5 years ago

Well, I lowered the line count again, this time to 10000. It took forever, but it completed without a segfault. The only problem now is that the router runs fine for a while and then stops responding. If I reboot, computers can't get an address from the DHCP server, and I have to reset the router to connect again. Sometimes I can use a static IP on my computer, and I also found that if I use the pause argument, the router starts working again, but then, of course, it doesn't block ads. So I guess the large mphosts file is just too much for this router, and I may have to look into customizing the list to use more domains instead of hosts, or something similar, to cut down the size of the file. If nothing else, it has given me something to tinker with, and when I get hold of another, more capable router, I will have a good idea of how it works. Maybe I'll even adapt it and use one of my Linux machines as my DNS server instead.

Paradox commented 5 years ago

Interesting... I removed the sort -u command and disabled my own code, and the script ran all the way through with no segfault. But when I opened the mphosts file in vi, I found some very interesting entries at the start of the file. One example is:

0.1.2.3 local

I'm pretty sure that shouldn't be there, and it may be the reason my router stops working and then starts working again when I use the pause argument...
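A quick way to look for more entries like that one, and to drop duplicates without calling sort, might look like the sketch below. Both function names are hypothetical, and the rule used to flag an entry (a second field with no dot in it) is an assumption about what a valid hosts line looks like, not something taken from the script. The awk de-duplication keeps every unique line in memory, so it may not suit a router that is already short on RAM.

```shell
#!/bin/sh
# Hypothetical helpers, not part of adbhostgen.

# Print lines that do not look like "IP hostname.with.a.dot".
# Assumption: a valid hostname contains at least one dot.
list_suspect_hosts() {
    awk 'NF < 2 || $2 !~ /\./' "$1"
}

# Drop repeated lines without sorting; the first occurrence is kept
# and the original order is preserved.
dedupe_unsorted() {
    awk '!seen[$0]++' "$1"
}

# Example:
#   list_suspect_hosts /jffs/dnsmasq/mphosts | head
#   dedupe_unsorted /tmp/mphosts.tmp > /tmp/mphosts.dedup
```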

m-parashar commented 4 years ago

Thanks for the feedback. Any solutions yet? I have been a bit busy with other things and could not look into this.