NICMx / Jool

SIIT and NAT64 for Linux
GNU General Public License v2.0
320 stars 66 forks source link

--pool4 --address could accept prefixes in CIDR notation #117

Closed toreanderson closed 9 years ago

toreanderson commented 9 years ago

When assigning an IPv4 pool, the command line utility won't let me specify the pool in CIDR notation:

$ jool --pool4 --add --address 185.47.42.0/24
Cannot parse '185.47.42.0/24' as a IPv4 address.

(BTW: There's a spelling mistake there, it should be «an IPv4 address».)

I can of course get around this by adding the addresses one by one in a shell loop, e.g.:

$ for n in {0..255}; do sudo jool --pool4 --add --address 185.47.42.$n; done
The address was added successfully.
[repeated 254 times]

However, this does feel a bit clunky, and it does take quite a bit of time (the above command took 15 seconds to run). It gets worse if you need a big pool4 to avoid port exhaustion, e.g., adding a /16 in this way would take over an hour. Is there any particular reason why --pool4 --address won't accept CIDR notation?

Tore

ydahhrk commented 9 years ago

Because we forgot early in development. Indeed, we need this improvement.

Also, the reason why it takes so long is because of the lousy "pool of numbers" implementation. This and issue #36 go hand-by-hand.

magg commented 9 years ago

I'd like to work on this issue

Here's my idea:

I see there are more issues related to this one. Is the poolnum module going to be rewritten?

ydahhrk commented 9 years ago
  • Validate the CIDR format in the user space application
  • Send the IP address and the maskbits to the kernel module
  • Add another method in pool4.c to get the list of IP addresses within the subnet and add them in a loop

Sounds good, thanks.

We would address the poolnum rewrite if we finish the other issues early. However, this is very unlikely, so it'll probably end up in milestone 3.4.0.

magg commented 9 years ago

Hi Alberto,

Is there a way to specifically handle IP's already inserted in the pool4? I see that the -EINVAL return code is in several points inside the method, maybe something else should replace that return code?

log_err("Address %pI4 already belongs to the pool.", addr); return -EINVAL;

While doing some testing to fix this issue, my loop can't continue if it encounters an IP that's already in the pool4:

root@ubuntu:/home/magg/NAT64/mod# jool -4 -d
192.0.2.2
  (Fetched 1 addresses.)
root@ubuntu:/home/magg/NAT64/mod# time jool -4 -a --address 192.0.2.2/24
Invalid input data or parameter (System error -7)

real    0m0.171s
user    0m0.000s
sys 0m0.006s
root@ubuntu:/home/magg/NAT64/mod# dmesg | tail
[24339.192513] Adding an address to the IPv4 pool.
[24339.193322] maxhosts: 256
[24339.193921] IP: 192.0.2.2
[24339.194747] Maskbits: 24
[24339.195678] Network: 192.0.2.0
[24339.196162] Broadcast: 192.0.2.255
[24339.196841] usable IP: 192.0.2.0
[24339.274031] usable IP: 192.0.2.1
[24339.343379] usable IP: 192.0.2.2
[24339.343729] Jool ERROR (pool4_register): Address 192.0.2.2 already belongs to the pool.
ydahhrk commented 9 years ago

That's weird. We should be using EEXIST there.

in pool4_register():

log_err("Address %pI4 already belongs to the pool.", addr);
return -EEXIST;

in pool4_cidr_range() something like:

error = pool4_register(temp);
if (error == -EEXIST)
    continue; /* whatever */
else if (error)
    return error;

Is that what you're looking for?

magg commented 9 years ago

yep thanks!

magg commented 9 years ago

I'm afraid that after some testing, adding a block of /24 still takes around 16 seconds and sometimes the kernel will run out of memory, adding a block of /16 will certainly crash the kernel

here's the stack trace: https://gist.github.com/magg/78c56d04b1f58aae9b43

Note: I'm using a vmware instance with 1GB of ram to do the testing

ydahhrk commented 9 years ago

It's to be expected. While the redundant netlink messages used to contribute to the lag, the main problem is poolnum.

The memory problem is caused by the massive and continuous arrays allocated by poolnum and the lag is caused by the shuffling immediately after. Both problems are #36's fault, but inserting addresses in CIDR notation makes it more horribly noticeable.

Edit: I just confirmed. It's not a kernel crash; it's a warning. The system is unable to allocate more memory, but it keeps functioning normally. If I remove Jool, Ubuntu's system monitor reports the memory goes back to normal.

magg commented 9 years ago

nice!

ydahhrk commented 9 years ago

Sorry it took so long. (Also, the glitch I told you by mail was nonsense on my part.) It's still going to take a couple of days before the release can be seen in jool.mx, and we will close all the milestone issues then.