WinDivert 2.0 Wishlist - Githubissues

basil00 commented 6 years ago

This is the WinDivert 2.0 wishlist.

Some of the main items I am interested in are:

New Layer Support (see FLOW layer below): The idea is to support additional layers beyond NETWORK and NETWORK_FORWARD. These will be based on the underlying Windows Filtering Platform (WFP) Application Layer Enforcement (ALE) layers, which can monitor network events such as the creation and deletion of network connections (flows) made by programs. The main advantage of these layers is that the process ID is available, and this can be passed to the WinDivert application. I have prototyped a FLOW layer and details are provided below.
Filter language support for matching raw/payload bytes: This helps filters match application-level protocols, e.g. payload[0] == 22 and payload[1] == 0x03 can help select "TLS client hello" packets. This may help fix some bottlenecks (e.g., #140).
Batch-mode Support: Read multiple packets at once to reduce overheads.
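A minimal sketch of what the payload-matching example above checks, written as ordinary C rather than as a filter string. The function name looks_like_tls_handshake is hypothetical and not part of WinDivert. It tests only the two bytes named in the example, so it matches any TLS handshake record, not only a client hello.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the payload starts like a TLS handshake record:
 * byte 0 is the record type (22 = handshake) and byte 1 is the
 * major version (0x03). Returns 0 otherwise, including when the
 * payload is too short to hold both bytes.
 * Hypothetical helper, for illustration only. */
int looks_like_tls_handshake(const uint8_t *payload, size_t len)
{
    if (payload == NULL || len < 2)
        return 0;
    return payload[0] == 22 && payload[1] == 0x03;
}
```

Doing this test inside the filter means that non-matching packets never have to be copied to the user application at all.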

Other ideas are welcome. I am aiming for a new release early 2019. It also depends on getting the new driver signed.

Flow Layer

The "FLOW" layer monitors the creation and deletion of all "connections" (a.k.a. flows) to and from the local machine. The process ID is available at the FLOW layer.

A prototype implementation is available here: https://github.com/basil00/Divert/tree/flow_layer

To open a handle at the FLOW layer, pass the WINDIVERT_LAYER_FLOW layer parameter to WinDivertOpen:

handle = WinDivertOpen(filter, WINDIVERT_LAYER_FLOW, 0,
    WINDIVERT_FLAG_SNIFF | WINDIVERT_FLAG_RECV_ONLY);

Note the SNIFF and RECV_ONLY flags are mandatory at this layer.

Flow events can now be detected using WinDivertRecv.

WinDivertRecv(handle, NULL, 0, &addr, NULL)

Note that the packet is NULL (there is no packet associated with flow events). Instead, all information is passed through the WINDIVERT_ADDRESS structure. The structure layout has been modified to accommodate support for different layers:

typedef struct
{
    INT64  Timestamp;                   /* Packet's timestamp. */
    UINT32 Layer:8;                     /* Packet's layer. */
    UINT32 Event:8;                     /* Packet event. */
    ...
    union
    {
        WINDIVERT_NETWORK_DATA Network; /* Network layer data. */
        WINDIVERT_FLOW_DATA Flow;       /* Flow layer data. */
    };
} WINDIVERT_ADDRESS;

Note the union. For the traditional NETWORK layers, only the Network member is valid. For the new FLOW layer, only the Flow member is valid.

Also note the new Event member. For the NETWORK layer there is only one event (new packet). But for the FLOW layer there are two events:

FLOW_ESTABLISHED: a new flow is created.
FLOW_DELETED: an old flow is deleted.
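A rough sketch of how an application might dispatch on the Layer and Event members before touching the union. Everything prefixed ex_ or EX_ is a hypothetical stand-in defined here so the example compiles on its own; the numeric values are illustrative and are not the constants from windivert.h.

```c
#include <stdint.h>

/* Illustrative layer and event codes (assumed values, not WinDivert's). */
enum { EX_LAYER_NETWORK = 0, EX_LAYER_FLOW = 2 };
enum { EX_EVENT_FLOW_ESTABLISHED = 1, EX_EVENT_FLOW_DELETED = 2 };

typedef struct { uint32_t if_idx; } ex_network_data;
typedef struct { uint32_t process_id; } ex_flow_data;

/* Cut-down mirror of the address structure: a tag plus a union. */
typedef struct {
    int64_t timestamp;
    unsigned int layer : 8;
    unsigned int event : 8;
    union {
        ex_network_data network;  /* valid only for NETWORK layers */
        ex_flow_data flow;        /* valid only for the FLOW layer */
    };
} ex_address;

/* Names the event, looking at the layer first. */
const char *ex_event_name(const ex_address *addr)
{
    if (addr->layer == EX_LAYER_FLOW) {
        switch (addr->event) {
        case EX_EVENT_FLOW_ESTABLISHED: return "FLOW_ESTABLISHED";
        case EX_EVENT_FLOW_DELETED:     return "FLOW_DELETED";
        default:                        return "UNKNOWN";
        }
    }
    if (addr->layer == EX_LAYER_NETWORK)
        return "NETWORK_PACKET";
    return "UNKNOWN";
}

/* Reads the process ID only when the flow member is the valid one;
 * returns 0 for every other layer. */
uint32_t ex_event_pid(const ex_address *addr)
{
    return addr->layer == EX_LAYER_FLOW ? addr->flow.process_id : 0;
}
```

Checking the layer before reading a union member avoids interpreting network-layer data as flow data.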

The FLOW_DATA structure contains useful information about the current flow.

typedef struct
{
    UINT32 ProcessId;                   /* Process ID. */
    UINT32 LocalAddr[4];                /* Local address. */
    UINT32 RemoteAddr[4];               /* Remote address. */
    UINT16 LocalPort;                   /* Local port. */
    UINT16 RemotePort;                  /* Remote port.  */
    UINT8  Protocol;                    /* Protocol. */
} WINDIVERT_FLOW_DATA;

Note that the ProcessId is available at this layer. The other fields form the network 5-tuple associated with the flow.

Although the FLOW layer is restricted (SNIFF-only), it is possible to filter packets matching the same 5-tuple at the NETWORK layer. This means it is possible for WinDivert applications to filter based on process ID, although it requires a bit of coordination. Also, developers should be aware of a possible race condition between WinDivertRecv and the process ID. However, perhaps this can be mitigated by testing if the corresponding process was alive at Timestamp.
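One possible shape for that coordination, as a sketch only: keep a small table that maps each 5-tuple to its process ID, add an entry on FLOW_ESTABLISHED, drop it on FLOW_DELETED, and consult it for every packet seen at the NETWORK layer. The ex_ types and functions below are hypothetical, the fixed-size linear table is chosen for brevity rather than speed, and the race condition mentioned above is not handled.

```c
#include <stdint.h>
#include <string.h>

#define EX_MAX_FLOWS 256

/* The network 5-tuple, laid out like the fields of the flow data. */
typedef struct {
    uint32_t local_addr[4];
    uint32_t remote_addr[4];
    uint16_t local_port;
    uint16_t remote_port;
    uint8_t  protocol;
} ex_tuple;

typedef struct {
    ex_tuple tuple;
    uint32_t process_id;
    int      in_use;
} ex_flow_entry;

typedef struct {
    ex_flow_entry entries[EX_MAX_FLOWS];
} ex_flow_table;

/* Field-by-field comparison, so struct padding never matters. */
static int ex_tuple_equal(const ex_tuple *a, const ex_tuple *b)
{
    return memcmp(a->local_addr, b->local_addr, sizeof(a->local_addr)) == 0 &&
           memcmp(a->remote_addr, b->remote_addr, sizeof(a->remote_addr)) == 0 &&
           a->local_port == b->local_port &&
           a->remote_port == b->remote_port &&
           a->protocol == b->protocol;
}

/* On FLOW_ESTABLISHED: remember the owner. Returns 0, or -1 if full. */
int ex_flow_add(ex_flow_table *t, const ex_tuple *tuple, uint32_t pid)
{
    for (size_t i = 0; i < EX_MAX_FLOWS; i++) {
        if (!t->entries[i].in_use) {
            t->entries[i].tuple = *tuple;
            t->entries[i].process_id = pid;
            t->entries[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* On FLOW_DELETED: forget the flow, if it is known. */
void ex_flow_remove(ex_flow_table *t, const ex_tuple *tuple)
{
    for (size_t i = 0; i < EX_MAX_FLOWS; i++) {
        if (t->entries[i].in_use && ex_tuple_equal(&t->entries[i].tuple, tuple)) {
            t->entries[i].in_use = 0;
            return;
        }
    }
}

/* For a NETWORK-layer packet: returns 1 and stores the PID if the
 * packet's 5-tuple belongs to a known flow, otherwise returns 0. */
int ex_flow_lookup(const ex_flow_table *t, const ex_tuple *tuple, uint32_t *pid)
{
    for (size_t i = 0; i < EX_MAX_FLOWS; i++) {
        if (t->entries[i].in_use && ex_tuple_equal(&t->entries[i].tuple, tuple)) {
            *pid = t->entries[i].process_id;
            return 1;
        }
    }
    return 0;
}
```

In a real application the table would be shared between the thread reading the FLOW handle and the thread reading the NETWORK handle, so it would also need locking.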

The branch includes a new flowtrack sample program that shows off the new FLOW layer. See the source code here: https://github.com/basil00/Divert/blob/flow_layer/examples/flowtrack/flowtrack.c

The flowtrack program provides a "top"-like interface showing all open network connections, and the corresponding program responsible for the connection. It is interesting to see how many Windows programs regularly "phone home".

I will investigate whether other ALE layers can be supported by WinDivert.

TechnikEmpire commented 6 years ago

IIRC there is a layer that gives the call-out driver the ability to grant or deny port binding that also includes the process id. So if there was access to that, then users can gain the feature of getting the process behind packets without separate kernel calls, AND gain a new feature to control those network stacks by permit/deny responses.

Then again I haven't had my morning coffee and it's been a bit since I saw the example code I think I'm recalling this from.

basil00 commented 6 years ago

IIRC there is a layer that gives the call-out driver the ability to grant or deny port binding that also includes the process id.

I think these might be the AUTH_* layers, which were something I was going to look at next. They will probably be made into a WinDivert SOCKET layer, since the events correspond to the standard socket operations bind/listen/accept.

Like FLOW, these layers do not fit into WinDivert's block-and-reinject model, so even this layer might need to be restricted. Specifically, there is no concept of "reinjection" like the NETWORK layer, although blocking decisions can be deferred by FwpsPendOperation0. However, pending saves state I think, so not a good idea to wait on a possibly unreliable user application. That said, I think DROP mode could be supported.

There is a whole bunch of ALE layers, so maybe there are other things of interest: https://docs.microsoft.com/en-us/windows/desktop/fwp/ale-layers

The REDIRECT layers also sound useful, but need to think of how it could fit into the current API.

TechnikEmpire commented 6 years ago

Yeah, when I was looking at those layers I didn't see how you could logically fit that into the existing API. My immediate reaction was to have it as a separate "layer" to WinDivert, where you can subscribe to a callback from the driver and give a yay-nay response in an OUT parameter as the user, with a default in case the user is unreliable for any reason. However I admit, I lent all of 10 minutes of thought to the concept.

I just think trying to pack new things into the existing API will make it get messy fast, and frankly these things become unrelated quickly, despite being under the umbrella of low level network land.

basil00 commented 6 years ago

My immediate reaction was to have it as a separate "layer" to WinDivert, where you can subscribe to a callback from the driver and give a yay-nay response in an OUT parameter as the user

This reminds me of netfilterqueue (the closest thing to WinDivert on Linux), where packets are sent to userland for a "decision", which can be something like allow/block or "reinject" after modification. I never liked this design.

I just think trying to pack new things into the existing API will make it get messy fast

Right, I do not want to change the API (in any major way at least).

basil00 commented 6 years ago

Another feature I've been thinking about is a meta "REFLECT" layer that returns information about currently open WinDivert handles, such as the process that opened the handle. The idea is to make WinDivert usage more transparent, and to encourage antivirus/anticheat/etc software to target the actual WinDivert user program rather than WinDivert itself (see #153). The implications of such a feature need to be thought through.

basil00 commented 6 years ago

I have a mostly working REFLECT layer implementation, which I may commit to the flow branch in a week or so.

Basically, this will be a meta-layer, and uses the normal WinDivert API (WinDivertOpen, WinDivertRecv, etc.).

At the REFLECT layer, a "packet" is generated every time a process opens or closes a WinDivert handle, and this event can be read from a REFLECT handle using WinDivertRecv. The handle's processId, layer, flags and priority will be returned via the address structure, and a representation of the filter is returned in the packet buffer.

The idea is to make WinDivert usage more transparent (at least to any process with Admin privileges), and is similar to how related tools (iptables in Linux, pf for MacOSX) can display information about packet diversion hooks at the command line.

basil00 commented 6 years ago

The first cut of the REFLECT layer is done. Next is a SOCKET layer that can monitor "socket" events (listen/connect/accept) based on the underlying WFP ALE AUTH_* layers. Unlike the FLOW layer, it should be possible to block (but not inject) at this layer.

coskifu commented 6 years ago

I would like a WINDIVERT_FLAG_ALLOW flag: when a packet matches the filter, the driver sends the packet on directly. Currently, WinDivertOpen(..., 0) intercepts and reinjects, and when there are many packets the performance is very poor (the switches between user mode and kernel mode are costly), even though I only want the packets to be sent directly.

basil00 commented 6 years ago

There is already the SNIFF flag which does not block the original packet.

Or do you mean something like Linux's netfilterqueue, where the kernel keeps a copy of the packet which can be "accepted" unmodified (NF_ACCEPT with data_len=0) by the user application? The problem is that the "accept" operation still requires a user-to-kernel context switch, so you don't really save anything except a packet copy operation.

I have an idea for a "batch" mode for WinDivert, which allows WinDivertSend/WinDivertRecv to operate on multiple packets at once. This should reduce kernel->user context switch overheads.

TechnikEmpire commented 6 years ago

Batch mode sounds like a good idea.

basil00 commented 6 years ago

The idea is to introduce two new flags:

WINDIVERT_FLAG_PARTIAL and
WINDIVERT_FLAG_BATCH.

The PARTIAL flag means that incomplete reads (WinDivertRecv) are allowed (which is currently the default). The BATCH flag will allow multi-packet reads and writes (see below). And no flag will be the new default which means the entire packet must be read else it will be an error. The PARTIAL and BATCH flags are incompatible.

For BATCH mode, the addr parameter to WinDivertRecv/WinDivertSend must be an array of addresses, probably of some fixed max size, say 64 or so. The packet buffer parameter will be packed with multiple packets. WinDivertRecv will attempt to fill up the buffer with as many packets as are available and fit in the buffer. Likewise, WinDivertSend will inject all the packets in the buffer.
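The packing rule described here can be sketched in plain C, assuming a simple greedy policy: copy queued packets back to back until the next one does not fit or the address array is full. The names ex_packet and ex_pack_batch are hypothetical, and the actual driver may well use a different policy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A queued packet waiting to be delivered (illustrative type). */
typedef struct {
    const uint8_t *data;
    size_t len;
} ex_packet;

/* Copies packets from the queue into buf, one after another, and
 * stops at the first packet that does not fit or once max_packets
 * have been copied. Returns the number of packets packed and stores
 * the number of bytes written in *used. */
size_t ex_pack_batch(const ex_packet *queue, size_t queued,
                     uint8_t *buf, size_t buf_len,
                     size_t max_packets, size_t *used)
{
    size_t count = 0;
    size_t offset = 0;

    while (count < queued && count < max_packets) {
        size_t len = queue[count].len;
        if (len > buf_len - offset)
            break;                      /* next packet would overflow */
        memcpy(buf + offset, queue[count].data, len);
        offset += len;
        count++;
    }
    *used = offset;
    return count;
}
```

Here max_packets plays the role of the fixed maximum size of the address array, since each packed packet needs one address slot.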

On paper, this should save a lot of overhead. But it is hard to say how much until a prototype can be tested.

basil00 commented 6 years ago

Some other ideas:

A new WinDivertShutdown() core API function analogous to the shutdown() socket function. This will stop WinDivert queuing packets in anticipation of a call to WinDivertClose().
Switch to using big-endian for IPv6 addresses, which is standard. For consistency, also use big-endian for IPv4 addresses.
Const-correctness.

coskifu commented 6 years ago

Are there plans for a kernel-mode firewall? Or plans to add a WINDIVERT_FLAG_PASS flag, so that a packet matching such a rule is sent directly over the adapter and other lower-priority rules do not intercept it (like winpkfilter)? Right now I am implementing a user-mode firewall: I use the "true" filter to intercept all packets and match them against my rules; matching packets are allowed and non-matching packets are denied. When sending files over a Windows share, I found that CPU usage is very high, packet latency gets worse, and the network speed slows down.

TechnikEmpire commented 6 years ago

You do realize that you can use plain user-mode windows filtering platform to do basic allow/deny filtering, right? No driver necessary.

This is how the windows firewall works underneath. You can create a new filter, block applications or ports or both, install and uninstall filters (which are just rules) all from user land without drivers.

You could probably even combine those with WinDivert. Also I'm automatically suspicious any time someone says "high CPU". I write performance critical code all the time. The most subtle things can cost you big time. Try running the performance profiler in visual studio and prove that your bottleneck is at the WinDivert functions. My bet is that it's not.

coskifu commented 6 years ago

Windows Firewall cannot be used for my job, because the user can turn it off, and we can turn it off too.
WinDivert pulls packets from kernel to user mode and pushes them back from user to kernel mode, and this procedure costs time. Just run the passthru demo (args: true 4) and send some files over a Windows share, and you will see it.

TechnikEmpire commented 6 years ago

You misread what I said. It's the engine that windows firewall uses. That's quite different. https://docs.microsoft.com/en-us/windows-hardware/drivers/network/wfp-user-mode-management-functions

Also I trust performance profilers not anecdotal claims. It's quite easy. Debug->Performance Profiler. Tick off CPU cycles and away you go. Make sure Just My Code is disabled.

TechnikEmpire commented 6 years ago

If you're looking to drop or permit certain applications, do so using the user mode API. Then open your WinDivert handle at a lower priority. Problem solved.

TechnikEmpire commented 6 years ago

Should put WinDivert itself through the performance profiler actually and see if there's any optimizations that can be made.

basil00 commented 6 years ago

Are there plans for a kernel-mode firewall? Or plans to add a WINDIVERT_FLAG_PASS flag, so that a packet matching such a rule is sent directly over the adapter and other lower-priority rules do not intercept it (like winpkfilter)?

This is currently not possible, since WinDivert operates at the network layer, and must use the network-layer WFP injection functions. There is also a WFP ethernet layer, but this is only available in Windows 8 and beyond, and WinDivert still needs to support Windows 7.

As for performance, the main discussion is covered by issue #52. In short, WinDivert seems to be OK for up to 1Gbps speeds, but the CPU usage will be high. I think batch mode should help reduce overheads further.

coskifu commented 6 years ago

You misread what I said. It's the engine that windows firewall uses. That's quite different. https://docs.microsoft.com/en-us/windows-hardware/drivers/network/wfp-user-mode-management-functions

Also I trust performance profilers not anecdotal claims. It's quite easy. Debug->Performance Profiler. Tick off CPU cycles and away you go. Make sure Just My Code is disabled.

The WFP is just what I need, thank you very much.

basil00 commented 6 years ago

I've added support for batched receives/sends. There is no special flag, just pass an array of WINDIVERT_ADDRESS to WinDivertRecvEx and WinDivertSendEx, e.g.:

WINDIVERT_ADDRESS addr[10];
UINT addrLen = sizeof(addr);
WinDivertRecvEx(handle, packet, sizeof(packet), &packetLen, 0, addr, &addrLen, NULL);
// The packet buffer will contain up to 10 packets.
// The addrLen value is updated to reflect the actual number of packets.
...
// Modify
...
WinDivertSendEx(handle, packet, packetLen, NULL, 0, addr, addrLen, NULL);

This reduces the number of context switches, and seems to result in a noticeable reduction in CPU usage (although I have not done formal testing yet).
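After a batched receive the application still has to find where each packet starts inside the shared buffer. A minimal sketch, assuming every packet in the buffer is IPv4 so that the Total Length header field (bytes 2 and 3, big-endian) gives the size of each one. The function names are hypothetical; IPv6 packets would need their own length rule, and WinDivert's own parsing helpers may be a better fit in real code.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the total length of the IPv4 packet starting at p, or 0 if
 * the header is missing, is not IPv4, or claims more bytes than are
 * available. */
static size_t ex_ipv4_len(const uint8_t *p, size_t avail)
{
    size_t total;

    if (avail < 20 || (p[0] >> 4) != 4)
        return 0;
    total = ((size_t)p[2] << 8) | p[3];
    if (total < 20 || total > avail)
        return 0;
    return total;
}

/* Walks a buffer of back-to-back IPv4 packets and records the byte
 * offset of each one. Stops at the first malformed packet. Returns
 * the number of offsets written (at most max_offsets). */
size_t ex_split_batch(const uint8_t *buf, size_t len,
                      size_t *offsets, size_t max_offsets)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < len && count < max_offsets) {
        size_t n = ex_ipv4_len(buf + pos, len - pos);
        if (n == 0)
            break;
        offsets[count++] = pos;
        pos += n;
    }
    return count;
}
```

The i-th offset then pairs with the i-th entry of the address array that was passed to the receive call.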

basil00 commented 6 years ago

The WinDivert filter language has been extended with the following terms:

packet[idx]: the idxth 8bit packet value.
packet16[idx]: the idxth 16bit packet value.
packet32[idx]: the idxth 32bit packet value.
tcp.Payload[idx] or udp.Payload[idx]: the idxth 8bit TCP/UDP payload value.
tcp.Payload16[idx] or udp.Payload16[idx]: the idxth 16bit TCP/UDP payload value.
tcp.Payload32[idx] or udp.Payload32[idx]: the idxth 32bit TCP/UDP payload value.

This allows WinDivert filter strings to match the contents of the packets in addition to fixed header fields. Different indexing modes are supported:

An undecorated integer, in which case the packet or payload is treated as an 8/16/32bit value array, similar to C arrays.
A b-decorated integer (e.g., packet32[17b]), in which case the integer is interpreted as a byte offset.
A negative (un)decorated integer, in which case indexing begins at the end of the packet or payload. E.g., packet32[-1] is the last 32bits of the packet.

Byte indexing is useful for matching unaligned words. Negative indexing is useful for protocols that pack fields at the end of packets, e.g., udp.Payload16[-1] == 0x0001 && udp.Payload16[-2] == 0x0001 matches the DNS query type and class.
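To make the three indexing modes concrete, here is a sketch of how a 16-bit lookup could be computed over a byte buffer. It is an interpretation of the description above, not the driver's code: the name ex_read16 is hypothetical, values are assumed to be read in network (big-endian) byte order, which is consistent with the DNS example, and a negative byte offset is assumed to count back from the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads one 16-bit value from buf.
 *   idx           index as written in the filter (may be negative)
 *   byte_indexed  nonzero for a "b" decorated index (byte offset),
 *                 zero for an array-style index (units of 16 bits)
 * Returns 1 and stores the value in *out, or 0 if out of range. */
int ex_read16(const uint8_t *buf, size_t len, long idx,
              int byte_indexed, uint16_t *out)
{
    const size_t width = 2;
    size_t scale = byte_indexed ? 1 : width;
    size_t offset;

    if (idx >= 0) {
        if ((size_t)idx > len)
            return 0;
        offset = (size_t)idx * scale;
    } else {
        /* Magnitude computed this way to avoid overflow on -idx. */
        size_t mag = (size_t)(-(idx + 1)) + 1;
        if (mag > len)
            return 0;
        if (mag * scale > len)
            return 0;
        offset = len - mag * scale;
    }
    if (offset > len || len - offset < width)
        return 0;
    *out = (uint16_t)((buf[offset] << 8) | buf[offset + 1]);
    return 1;
}
```

For a DNS query whose last four payload bytes are 00 1c 00 01, index -1 yields 0x0001 (the class) and index -2 yields 0x001c (the type).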

basil00 commented 6 years ago

I've added the WinDivertShutdown() core API function discussed above. Most of the main features for version 2.0 are now implemented. Some other additional ideas are:

UNBIND event for the SOCKET layer. This is the opposite of the BIND event, and may help the user application manage resources.
Random matching. For example, introduce a new random32 filter language variable. This will allow traffic sampling, e.g., random32 < 0x7FFFFFFF matches half of all packets.
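The sampling arithmetic can be sketched as follows, assuming random32 is uniformly distributed over the full 32-bit range: a comparison random32 < T matches a fraction T / 2^32 of packets. The helper names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Converts a sampling fraction in [0, 1] into the threshold T for a
 * "random32 < T" test. Fractions at or above 1 are clamped to the
 * largest threshold, which matches all values except 0xFFFFFFFF. */
uint32_t ex_sample_threshold(double fraction)
{
    if (fraction <= 0.0)
        return 0;
    if (fraction >= 1.0)
        return UINT32_MAX;
    return (uint32_t)(fraction * 4294967296.0);  /* fraction * 2^32 */
}

/* The comparison the filter would perform for one packet. */
int ex_sample_match(uint32_t random32, uint32_t threshold)
{
    return random32 < threshold;
}
```

A fraction of 0.5 gives a threshold of 0x80000000, so exactly half of the possible 32-bit values match.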

basil00 commented 5 years ago

Most of the main 2.0 features have been implemented. The focus will now be testing and optimization, as well as a few tweaks. Hopefully there will be a release in Q1 2019, depending on driver signing.

Regarding some previous comments:

IPv6 network byte ordering: Although it is traditional to always use network byte ordering for IPv6 addresses, WinDivert instead uses "host" byte ordering. I think changing this is too much trouble, since it makes the interface inconsistent, and host byte ordering is slightly more efficient for filter evaluation. There is also a new WinDivertHelperNtohIpv6Address() helper function to convert between the two orderings.
UNBIND socket event: Not sure if it fits since this event cannot be "blocked", unlike the other socket events. It could still be added, but means different events have different semantics, which is not ideal. If not added, the flow DELETED event may be used to indicate that a connection no longer exists.

One final feature I may implement:

Internal Injection: Handle packet injection internally if there is more than one open WinDivert handle. This bypasses WFP injection which introduces quite a bit of overhead. This is an optimization and will not affect the API.

basil00 commented 5 years ago

The main features for 2.0 are now pretty much frozen. See the CHANGELOG here.

The "internal injection" idea turns out to be complicated to implement, so will probably be delayed until the next major update. The main outstanding TODO for 2.0 is to update the WinDivert documentation, which is a lot of work given the number of new features...

chris1201 commented 5 years ago

Could you add the stream layer(FWPM_LAYER_STREAM_V*) to capture TCP payload directly ?

basil00 commented 5 years ago

Could you add the stream layer(FWPM_LAYER_STREAM_V*) to capture TCP payload directly ?

I've seriously considered adding this, but it will not be part of the 2.0 release.

To do the stream layers "properly" would also probably require a stream-like WinDivert API to be implemented, similar to SOCK_STREAM sockets, so is quite a radical change. Even then, I am not sure if there are technical limitations that will cause problems. For example, I previously considered adding the TRANSPORT layers, but these layers are not stateless meaning that they do not really fit with the WinDivert model (basically, you'd need to somehow thread some arbitrary-sized state to the user application and back). The alternative is to make these layers SNIFF-only, but then they are not as useful.

darrennong commented 5 years ago

Redirect connections to a local process：https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/fwpsk/nf-fwpsk-fwpsredirecthandlecreate0

basil00 commented 5 years ago

Redirect connections to a local process：https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/fwpsk/nf-fwpsk-fwpsredirecthandlecreate0

I've looked into that too. One problem is that it requires Windows 8 or above, and WinDivert still officially supports Windows 7 until the Microsoft end-of-life (January 14, 2020). So maybe it will be included in WinDivert 3.0.

gpotter2 commented 5 years ago

Hello ! I might fall off-topic but is there a place to ask more general questions (which might not be appropriate for a bug report) ? Thanks !

basil00 commented 5 years ago

@gpotter2 One thing you can do is post to stackoverflow and email me a link to the question.

basil00 commented 5 years ago

I've reconsidered and added UNBIND/DISCONNECT socket events, since these are quite useful for some applications. These events can only be "sniffed" and not blocked.

gpotter2 commented 5 years ago

@basil00 Seems I can't email you anymore :/

      'reqrypt' on 28/02/2019 15:54
            Server error: '454 4.7.1 <basil@reqrypt.org>: Relay access denied'

But thanks a lot for your work ! Feel free to ping me whenever L2 is available, so that I implement it in Scapy

basil00 commented 5 years ago

@gpotter2 That's concerning, perhaps try again and see if the problem has resolved?

It is also possible to contact me via reddit private message.

As for layer 2 support, it will likely be added in WinDivert 3.0 after Windows 7 support is deprecated. Hopefully in the second half of 2019.

basil00 commented 5 years ago

A build of WinDivert-2.0.0 has been made available for testing:

WinDivert-2.0.0-rc.zip

Note the drivers are not signed, so if you want to try it out, then please follow the test signing procedure explained here.

chris1201 commented 5 years ago

Could you add the stream layer(FWPM_LAYER_STREAM_V*) to capture TCP payload directly ?

I've seriously considered adding this, but it will not be part of the 2.0 release.

To do the stream layers "properly" would also probably require a stream-like WinDivert API to be implemented, similar to SOCK_STREAM sockets, so is quite a radical change. Even then, I am not sure if there are technical limitations that will cause problems. For example, I previously considered adding the TRANSPORT layers, but these layers are not stateless meaning that they do not really fit with the WinDivert model (basically, you'd need to somehow thread some arbitrary-sized state to the user application and back). The alternative is to make these layers SNIFF-only, but then they are not as useful.

In fact, I really want this feature very much, even if it is just sniff mode, it will make TCP data analysis faster and easier.

basil00 commented 5 years ago

@chris1201 I also think it will be useful. However, the features for 2.0 release are more-or-less fixed, so it will have to wait until the next version.

basil00 commented 5 years ago

A package with signed drivers is now available:

WinDivert-2.0.0-rc-signed.zip

The official release will probably be in a week or so.

Thanks for everyone's comments and suggestions.

basil00 commented 5 years ago

WinDivert version 2.0.0-rc has been released: https://github.com/basil00/Divert/releases/tag/v2.0.0-rc

The website will be updated in a week or so.

gpotter2 commented 4 years ago

Hi, merry Christmas!

With the new year coming soon comes the drop of Win7. If that's still relevant,

As for layer 2 support, it will likely be added in WinDivert 3.0 after Windows 7 support is deprecated.

Any news / plans ? :smile:

Thanks a lot for your great work on WinDivert

basil00 commented 4 years ago

Any news / plans ?

You are in luck, I have been working on this. I've created a branch with the current progress: https://github.com/basil00/Divert/tree/eth_layer

There is still a lot to do (testing/debugging) before it is ready, however, so it may take several more months. I will create a new WinDivert 3.0 wishlist soon.

gpotter2 commented 4 years ago

That's awesome thanks so much.

basil00 / WinDivert

WinDivert 2.0 Wishlist #156
