Open yonghong-song opened 9 years ago
The following diff fixed the problem:
plumgrid@yhs-plumgrid:~/iovisor/bcc/tests/jit$ git diff
diff --git a/tests/jit/bpfdev1.b b/tests/jit/bpfdev1.b
index c2ecddd..bc04002 100644
--- a/tests/jit/bpfdev1.b
+++ b/tests/jit/bpfdev1.b
@@ -110,49 +110,53 @@ u32 main(struct proto::skbuff *skb) {
u32 old_ip:32;
u64 src_mac:48;
u64 dst_mac:48;
+ u32 bpfdev_ip:32;
+ u32 slave_ip:32;
struct ConfigKey cfg_key = {.index = 0};
struct ConfigLeaf *cfg_leaf;
config_map.lookup(cfg_key, cfg_leaf) {};
on_valid(cfg_leaf) {
- struct MacaddrKey mac_key = {.ip = cfg_leaf->bpfdev_ip};
- struct MacaddrLeaf *mac_leaf;
-
- mac_key.ip = cfg_leaf->bpfdev_ip;
- macaddr_map.lookup(mac_key, mac_leaf) {};
- on_valid (mac_leaf) {
- src_mac = mac_leaf->mac;
- } else {
- goto EOP;
- }
-
- mac_key.ip = cfg_leaf->slave_ip;
- macaddr_map.lookup(mac_key, mac_leaf) {};
- on_valid (mac_leaf) {
- dst_mac = mac_leaf->mac;
- } else {
- goto EOP;
- }
-
- // rewrite ethernet header
- pkt.rewrite_field($ethernet.dst, dst_mac);
- pkt.rewrite_field($ethernet.src, src_mac);
-
- // ip & udp checksum
- incr_cksum(@ip.hchecksum, $ip.src, cfg_leaf->bpfdev_ip);
- incr_cksum(@ip.hchecksum, $ip.dst, cfg_leaf->slave_ip);
- incr_cksum(@udp.crc, $ip.src, cfg_leaf->bpfdev_ip, 1);
- incr_cksum(@udp.crc, $ip.dst, cfg_leaf->slave_ip, 1);
-
- // rewrite ip src/dst fields
- pkt.rewrite_field($ip.src, cfg_leaf->bpfdev_ip);
- pkt.rewrite_field($ip.dst, cfg_leaf->slave_ip);
+ bpfdev_ip = cfg_leaf->bpfdev_ip;
+ slave_ip = cfg_leaf->slave_ip;
+ } else {
+ goto EOP;
+ }
+ struct MacaddrKey mac_key = {.ip = bpfdev_ip};
+ struct MacaddrLeaf *mac_leaf;
+
+ mac_key.ip = bpfdev_ip;
+ macaddr_map.lookup(mac_key, mac_leaf) {};
+ on_valid (mac_leaf) {
+ src_mac = mac_leaf->mac;
+ } else {
goto EOP;
+ }
+ mac_key.ip = slave_ip;
+ macaddr_map.lookup(mac_key, mac_leaf) {};
+ on_valid (mac_leaf) {
+ dst_mac = mac_leaf->mac;
} else {
goto EOP;
}
+
+ // rewrite ethernet header
+ pkt.rewrite_field($ethernet.dst, dst_mac);
+ pkt.rewrite_field($ethernet.src, src_mac);
+
+ // ip & udp checksum
+ incr_cksum(@ip.hchecksum, $ip.src, bpfdev_ip);
+ incr_cksum(@ip.hchecksum, $ip.dst, slave_ip);
+ incr_cksum(@udp.crc, $ip.src, bpfdev_ip, 1);
+ incr_cksum(@udp.crc, $ip.dst, slave_ip, 1);
+
+ // rewrite ip src/dst fields
+ pkt.rewrite_field($ip.src, bpfdev_ip);
+ pkt.rewrite_field($ip.dst, slave_ip);
+
+ goto EOP;
}
}
plumgrid@yhs-plumgrid:~/iovisor/bcc/tests/jit$
Basically, the map result should have a short live range by copying to local variable.
the diff is mangled it seems. It seems it's a workaround by using inside knowledge of front-end gen ? I think we should try to fix it in the backend. Clearly r2=r1 assignments are redundant. I also don't 'remat' pass to be run. It's important. Otherwise constant values will be copied between registers instead of using 'mov Rx, imm' which is faster, less register pressure and easier on verifier.
I've updated the diff with proper github markdown syntax. Alexei, do you have any advice on how to fix it in the backend, or are you considering that your own AI?
I don't have a concrete plan. I suspect there is still something we missing about pass_manager. Probably need to enable several more passes.
On Mon, May 11, 2015 at 6:34 AM, Brenden notifications@github.com wrote:
I've updated the diff with proper github markdown syntax. Alexei, do you have any advice on how to fix it in the backend, or are you considering that your own AI?
— Reply to this email directly or view it on GitHub https://github.com/plumgrid/bcc/issues/10#issuecomment-100908929.
For the following B program:
plumgrid@yhs-plumgrid:~/iovisor/bcc/tests/jit$
The control plane app looks like:
plumgrid@yhs-plumgrid:~/iovisor/bcc/tests/jit$ cat bpfdev1.py
plumgrid@yhs-plumgrid:~/iovisor/bcc/tests/jit$
In order to run complete test, there are other changes in bcc are needed to support new bpfdev device and these changes are not included here.
The test failed with the following symtom:
instruction "96" does an assignment from r0 to r1, and verifier thinks r1 could be map_value or null, although instruction "97" checks "r0" for null.
The compiler ought to generate better code. The instruction "r1 = r0" is not necessary.
I dumped IR (change py program debug=0 to debug=1), and feed the IR to llc. `llc -march=bpf -filetype=asm -O3 b.ll``
llc also generates similar code:
-O2 generates similar code.
Studying the LLVM optimization passes, there is a path in LLVM which called "virtual register rewrite" and it indeed removes SOME of the above redundant copies, but not all of them, hence causing the issue.
FYI, I changed LLVM to print out the pass applied during bcc compiler optimization and below is the result:
Checking llc compiler passes, it is very similar (I did not compare one-to-one) to the above for function passes.
In summary, this is an LLVM issue and we may have to fix there.