Open samJ-bitsight opened 2 months ago
1:2:: is invalid if my reading of RFC5952 is correct. Where did you get an address that ended with :: in the first place? It was my understanding that you could not legally end an ip6 address with compressed zeros (except maybe if it’s the last of the 8 groups, though I’m not too sure about that)
Will defer to @robertdavidgraham to review because I believe he takes joy in pedantry like this, and I really don’t know IPv6 that well
If the address is invalid it should probably be thrown away and a warning issued
We are scraping various inputs of ipv6 addresses and all of them seem to output addresses ending in :: I don't necessarily know if 1:2:: is valid, but we will see more complicated addresses ending in :: as well, this is just an example
I mean the classic loopback ipv6 address is 1::
We are scraping various inputs of ipv6 addresses and all of them seem to output addresses ending in ::
May I ask where you're scraping them from? (just curious)
I don't necessarily know if 1:2:: is valid, but we will see more complicated addresses ending in :: as well, this is just an example
The only case that an ending of :: is allowed (based on my reading of the RFC) is where you have a group of 7 (non compressed) groups preceding it
I mean the classic loopback ipv6 address is 1::
You have that backwards- loopback is ::1
Are CIDR blocks of that form handled correctly? For example, 2001:10::/28
By the way, I'm not opposed to fixing the behavior if it's incorrect. I would actually like to help with that, but I'm not completely certain that that style of address is valid (I'm not certain it's invalid either, I was hoping someone that works with ipv6 regularly enough would come along and tell one of us why we're wrong...)
Another question, what do applications like ping/ping6, nmap, socat, etc. think about the example address (similar ones)
I would be surprised if the implementation in masscan wasn't taken from a sound implementation. But I do find myself often surprised, I guess
Can you give an example of the exact command you're using? So I can reproduce it and look into it?
I'm reading RFC4291 section 2.2 and I now believe that the notation/example you gave is, in fact, valid - just very rare outside of a range or CIDR context
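A quick way to sanity-check that reading without trusting anyone's memory of the RFC is to ask the platform's own parser. This is only a minimal sketch, assuming a POSIX system with inet_pton(); the helper name is_valid_ipv6 is made up for illustration and is not part of masscan.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` parses as an IPv6 address
 * according to the platform's inet_pton(), otherwise 0. */
static int is_valid_ipv6(const char *text)
{
    struct in6_addr addr;

    assert(text != NULL);
    return inet_pton(AF_INET6, text, &addr) == 1;
}
```

With this, "1:2::", "1::" and "::1" should all be accepted, while a string ending in a single colon such as "1:2:" should be rejected.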
I expect the issue in the parsing to be caused by assuming addresses with trailing zeros to be network addresses - as is the case a very large percentage of the time - and as a consequence, there's an off-by-one error when such an address isn't followed by either a slash (for a prefix length) or a dash (to delimit a range) immediately after
(If abcd::/8 doesn't parse, though, this guess is wrong)
What I mean is that in practice, representations like aaaa::, aaaa:bbbb::, etc. are almost always being used in a representation of the network portion of a CIDR string or first address in a range
Anyway, it may be rare for a host address to end with several groups of 0 bits but that doesn't mean it shouldn't parse correctly
As a work around, you may be able to append /128 to such addresses (or decompress them, obviously)
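For the "decompress them" option, something along these lines might do. It is a sketch only, assuming POSIX inet_pton() is available; expand_ipv6 and expands_to are hypothetical names, not masscan functions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `text` as eight colon-separated hex groups
 * with no "::" compression. Returns 0 on success, -1 on a parse error or
 * if `out` is too small. */
static int expand_ipv6(const char *text, char *out, size_t outlen)
{
    struct in6_addr addr;
    size_t used = 0;
    int i;

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET6, text, &addr) != 1)
        return -1;

    for (i = 0; i < 8; i++) {
        unsigned word = (unsigned)addr.s6_addr[2 * i] << 8 | addr.s6_addr[2 * i + 1];
        int n = snprintf(out + used, outlen - used, "%s%x", i ? ":" : "", word);

        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}

/* Convenience wrapper for checking one input against an expected string. */
static int expands_to(const char *text, const char *expected)
{
    char out[64];

    return expand_ipv6(text, out, sizeof out) == 0 && strcmp(out, expected) == 0;
}
```

For example, "1:2::" would come out as "1:2:0:0:0:0:0:0", which has no trailing compressed zeros left to trip over.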
Regarding potentially fixing this: I looked at the parser and it's a very complex (to me) state machine that I'm actually not too enthusiastic about trying to fully understand and fix. So after all this, I'm not sure I can help much. One of the maintainers may pick it up eventually :/
So, I don't actually have a system that even has an IPv6 interface- so I had to lie a little bit to masscan to get it to run
But I don't necessarily see the issue you're describing - can you provide some more detail about how you're invoking masscan and where you're seeing the truncation?
I tried to reproduce using a build from this repository (master branch) and a build from the ivre masscan fork (master branch), though they've converged for the most part
Here's what I saw:
$ sudo /bin/masscan -ddd -p 1 1:: --source-ip 2001:3b8::1234 --router-mac cc:cc:cc:cc:cc:cc
[+] pcap: found library: libpcap.so
pfring: found 'libpfring.so'!
pfring: successfully loaded PF_RING API
pfring: found 'pf_ring' driver
pfring: found 'pf_ring' driver module
if: route: ' eth0' dst=0.0.0.0 src=0.0.0.0 gw=1.2.3.1 priority=0
if: route: ' eth0' dst=1.2.3.0 src=1.2.3.4 gw=0.0.0.0 priority=0
[+] interface = eth0
[+] if(eth0): pcap: libpcap version 1.8.1
[+] if(eth0): opening...
[+] if(eth0): successfully opened
[+] interface-type = 1
if:eth0: not receiving transmits
if:eth0: type=ethernet(1)
[+] source-mac = cc-ff-cc-ff-cc-ff
[+] source-ip = [2001:3b8::1234]
[+] router-mac-ipv6 = cc-cc-cc-cc-cc-cc
[+] if(eth0): initialization done.
Starting masscan 1.3.2 (http://bit.ly/14GZzcT) at 2024-09-08 17:00:22 GMT
Initiating SYN Stealth Scan
Scanning 1 hosts [1 port/host]
[+] starting transmit thread #0
[+] starting throttler: rate = 100.00-pps
THREAD: xmit: starting main loop: [0..1]
[+] starting receive thread #0
[+] transmit thread #0 complete
[+] THREAD: recv: starting main loop
[+] waiting for threads to finish
[+] exiting receive thread #0
[+] exiting transmit thread #0
[+] all threads have exited
No, it's not the most realistic recreation of a scan of 1::, but it did parse it and create a list of 1 IPv6 address. I checked the wire and it did in fact transmit a packet as instructed:
17:08:02.985277 IP6 (hlim 255, next-header TCP (6) payload length: 20) 2001:3b8::1234.55282 > 1::.tcpmux: Flags [S], cksum 0x4fac (correct), seq 1745348173, win 1024, length 0
What are you seeing on your side? And can you confirm you're using a build from master and not an older package from some distribution's package manager?
Hey! so I am using master, the problem is in the output once a valid address is found. if the found address ended with :: it outputs
open tcp port 1: 1725889082 where instead we would want it to say open tcp port 1:: 1725889082
We aren't using CIDRs to scan the ipv6 space because it's so large. We have a list of single addresses that have been scraped from various locations, like DNS resolutions, and we minify them to help save a little on space. The current workaround is that when we pull addresses from the masscan output, we check whether the ip address ends in a single : and, if so, append the missing one
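For reference, the post-processing workaround described here could look roughly like this. It is a sketch under assumptions: it operates on the address token only (already split out of the output line), and repair_trailing_colon / repairs_to are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, appending one ':' when the
 * input ends in a single ':' (e.g. "1:2:" becomes "1:2::"). Inputs that
 * already end in "::" or in a hex digit are copied unchanged.
 * Returns 0 on success, -1 if `out` is too small. */
static int repair_trailing_colon(const char *in, char *out, size_t outlen)
{
    size_t len;
    int needs_colon;

    assert(in != NULL && out != NULL);
    len = strlen(in);
    needs_colon = len > 0 && in[len - 1] == ':' &&
                  (len < 2 || in[len - 2] != ':');

    if (len + (size_t)needs_colon + 1 > outlen)
        return -1;

    memcpy(out, in, len);
    if (needs_colon)
        out[len++] = ':';
    out[len] = '\0';
    return 0;
}

/* Convenience wrapper for checking one input against an expected string. */
static int repairs_to(const char *in, const char *expected)
{
    char out[64];

    return repair_trailing_colon(in, out, sizeof out) == 0 &&
           strcmp(out, expected) == 0;
}
```

A truncated "1:" would become "1::", while "1::" and "::1" pass through untouched, so running it on already-correct output should be harmless.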
Hey! so I am using master, the problem is in the output once a valid address is found. if the found address ended with :: it outputs
open tcp port 1: 1725889082
where instead we would want it to say
open tcp port 1:: 1725889082
Ohhhh... well, I feel very silly then. I assumed the issue was in parsing the target list
I think what you're seeing should be an easy fix since it should be a simple issue in one (or all) of the relatively trivial output modules. I'm happy to take a quick stab at a fix and ask
A few things that you can help with:
1. Do you see that output on the command line output only, or do you see it with the output formats?
Quickest way to check:
masscan -oB out.bin <your usual scan params>
^-- exclude whatever -o option you normally use, only use -oB to capture to the compact/lossless binary format
Then you can do:
for x in D J X G; do
    masscan --readscan out.bin -o$x out.fmt.$x
done
Check each to see if the issue manifests
Separately- it sounds like you may be doing this as part of an automated workflow- which is also my use case. If you're interested in tracking the status of the scan as it runs based on the console output, I submitted #564 (--ndjson-status) some years ago, which makes all the console stdout/stderr output line-buffered NDJSON. It beats the heck out of having regex to parse the status, if that's what you're currently doing. Aside from "FYI, it may be helpful to you", I'm also curious if the issue manifests there as well. That's something you would have to manually eyeball in a real scan (or I suppose you could 2>scan.stderr and grep)
I would check these things myself but as I mentioned I don't have any ipv6 capable kernels running (meaning I can't even assign a dummy address to an interface, I would have to first reboot) and given that, I can't think of a really quick way to fake a SYN|ACK response that doesn't involve writing code that receives and transmits via raw sockets
Let me know what you see from those other output formats and then I'll have an easier time of finding the problematic code
Thanks
Hey! Yes, I am sadly running this in an automated system where we are doing a form of regexing the output. Luckily, we can pretty much just listen for any line that doesn't start with # to get all valid data, but I'll look into the ndjson-status and see if that helps us more!
Note for later...
The problem would likely be in ipv6address_fmt()
Most of the output modules use ipaddress_fmt() to get the ASCII representation of addresses. For ipv6 addresses, it calls ipv6address_fmt():
struct ipaddress_formatted ipaddress_fmt(ipaddress a)
{
    struct ipaddress_formatted out;
    stream_t s;
    ipv4address ip = a.ipv4;
    if (a.version == 6) {
        return ipv6address_fmt(a.ipv6);
    }
    ...
I'll try to find some time to link a simple test against this to see if I can reproduce the issue that way
static void
_append_ipv6(stream_t *out, const unsigned char *ipv6)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;
    int is_ellision = 0;
    /* An IPv6 address is printed as a series of 2-byte hex words
     * separated by colons :, for a total of 16-bytes */
    for (i = 0; i < 16; i += 2) {
        unsigned n = ipv6[i] << 8 | ipv6[i + 1];
        /* Handle the ellision case. A series of words with a value
         * of 0 can be removed completely, replaced by an extra colon */
        if (n == 0 && !is_ellision) {
            is_ellision = 1;
            while (i < 13 && ipv6[i + 2] == 0 && ipv6[i + 3] == 0)
                i += 2;
            _append_char(out, ':');
            /* test for all-zero address, in which case the output
             * will be "::". */
            while (i == 14 && ipv6[i] == 0 && ipv6[i + 1] == 0) {
                i = 16;
                _append_char(out, ':');
            }
            continue;
        }
        /* Print the colon between numbers. Fence-post alert: only colons
         * between numbers are printed, not at the beginning or end of the
         * string */
        if (i)
            _append_char(out, ':');
        /* Print the digits. Leading zeroes are not printed */
        if (n >> 12)
            _append_char(out, hex[(n >> 12) & 0xF]);
        if (n >> 8)
            _append_char(out, hex[(n >> 8) & 0xF]);
        if (n >> 4)
            _append_char(out, hex[(n >> 4) & 0xF]);
        _append_char(out, hex[(n >> 0) & 0xF]);
    }
}
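To try the elision logic in isolation, here is a standalone sketch that mirrors the loop above but writes into a plain char buffer. The buffer type and the names struct buf, buf_append, fmt_ipv6_sketch and fmt_matches are hypothetical stand-ins (masscan's real stream_t and _append_char are not reproduced here), so treat it as an approximation of the listing rather than the actual code path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for masscan's stream_t: a bounded char buffer. */
struct buf {
    char data[64];
    size_t len;
};

static void buf_append(struct buf *b, char c)
{
    assert(b != NULL);
    if (b->len + 1 < sizeof(b->data)) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}

/* Mirrors the elision loop from the listing, writing into `out`. */
static void fmt_ipv6_sketch(struct buf *out, const unsigned char *ipv6)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;
    int elided = 0;

    out->len = 0;
    out->data[0] = '\0';

    for (i = 0; i < 16; i += 2) {
        unsigned n = (unsigned)ipv6[i] << 8 | ipv6[i + 1];

        if (n == 0 && !elided) {
            elided = 1;
            /* Skip the rest of this run of zero words. */
            while (i < 13 && ipv6[i + 2] == 0 && ipv6[i + 3] == 0)
                i += 2;
            buf_append(out, ':');
            /* The run reached the last word: emit the second colon.
             * (The listing uses a while here; it runs at most once.) */
            if (i == 14 && ipv6[i] == 0 && ipv6[i + 1] == 0) {
                i = 16;
                buf_append(out, ':');
            }
            continue;
        }
        if (i)
            buf_append(out, ':');
        if (n >> 12)
            buf_append(out, hex[(n >> 12) & 0xF]);
        if (n >> 8)
            buf_append(out, hex[(n >> 8) & 0xF]);
        if (n >> 4)
            buf_append(out, hex[(n >> 4) & 0xF]);
        buf_append(out, hex[n & 0xF]);
    }
}

/* Convenience wrapper: format 16 raw bytes and compare with `expected`. */
static int fmt_matches(const unsigned char *ipv6, const char *expected)
{
    struct buf b;

    fmt_ipv6_sketch(&b, ipv6);
    return strcmp(b.data, expected) == 0;
}
```

Tracing this by hand, the bytes for 1:: and 1:2:: come out with both trailing colons intact, which would suggest the truncation is introduced somewhere other than this loop, assuming the sketch really does match what is on master.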
Now, let's see what other applications do - this isn't exactly an unusual operation (and there was reference up the call stack to borrowing code from another project)
tcpdump
It does seem that the implementation in tcpdump has a case explicitly for trailing zeros
This should be relatively simple to fix
@samJ-bitsight can you be more explicit with regard to how exactly you're invoking masscan?
-oG, -oD, ...)
masscan? Basically, is there a possibility that something downstream of masscan is the culprit?
Asking that third one because I added a full set of tests for the address formatting functions in #801 and included some of the addresses you provided as problematic, but they all passed
Perhaps I'm missing some obvious case where the address is not formatted by that function - if I know exactly where you're seeing it, I can try to see if it's somehow going around it
If an address I input ends with :: (example 1:2::) masscan will output the ip as 1:2: which is invalid.