tumi8 / vermont

Vermont (VERsatile MONitoring Toolkit) is an open-source software toolkit for the creation and processing of network flow data.
https://www.net.in.tum.de/research/software/#vermont
GNU General Public License v2.0
56 stars 22 forks source link

crash in aggregator with flow data containing fields of type `IPFIX_TYPE_string` or `IPFIX_TYPE_octetArray` #108

Open nickbroon opened 6 years ago

nickbroon commented 6 years ago

Using the following very basic config containing an aggregator, and the attached basic ipfix dump of flows that contain the "applicationId (95)" field, vermont will crash, with the below backtrace in the aggregator hashtable. Anyone familiar with aggregator/hashtable code able to easily diagnose the cause of the crash? If the flow capture does not contain the applicationId (95) the aggregator does not crash, so perhaps it related to 'octetArray' field support, or that in this case the field has a length of 5?

./vermont -l debug -f ./vermont.xml >& vermont.log
*** Error in `./vermont': malloc(): memory corruption: 0x00007f240c003670 ***
Aborted (core dumped)
Thread 4 "IpfixReceiver" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff0bca700 (LWP 1030)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5d09f5d in __GI_abort () at abort.c:90
#2  0x00007ffff5d5228d in __libc_message (action=action@entry=do_abort, 
    fmt=fmt@entry=0x7ffff5e79528 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff5d5964a in malloc_printerr (action=<optimised out>, str=0x7ffff5e75ef0 "malloc(): memory corruption", 
    ptr=<optimised out>, ar_ptr=<optimised out>) at malloc.c:5426
#4  0x00007ffff5d5d42d in _int_malloc (av=av@entry=0x7fffe8000020, bytes=bytes@entry=48) at malloc.c:3741
#5  0x00007ffff5d5ff3e in __GI___libc_malloc (bytes=48) at malloc.c:3086
#6  0x00007ffff66ae458 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00005555557b80c1 in InstanceManager<BucketListElement>::getNewInstance (this=0x555555b9de90)
    at /home/user/vermont/src/core/InstanceManager.h:95
#8  0x00005555557bc318 in FlowHashtable::bufferDataBlock (this=0x555555b9daa0, data=...)
    at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:542
#9  0x00005555557bd7ff in FlowHashtable::aggregateDataRecord (this=0x555555b9daa0, record=0x7fffe8013550)
    at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:749
#10 0x0000555555744bab in IpfixAggregator::onDataRecord (this=0x555555b9d910, record=0x7fffe8013550)
    at /home/user/vermont/src/modules/ipfix/aggregator/IpfixAggregator.cpp:76
#11 0x0000555555718810 in IpfixRecordDestination::receive (this=0x555555b9da88, ipfixRecord=0x7fffe8013550)
    at /home/user/vermont/src/modules/ipfix/IpfixRecordDestination.cpp:40
#12 0x000055555569f20b in Source<IpfixRecord*>::send (this=0x555555b9d740, t=0x7fffe8013550)
    at /home/user/vermont/src/core/Source.h:98
#13 0x000055555571179a in IpfixCollector::send (this=0x555555b9d6e0, ipfixRecord=0x7fffe8013550)
    at /home/user/vermont/src/modules/ipfix/IpfixCollector.cpp:102
#14 0x000055555579d5b0 in IpfixParser::push (this=0x555555b9d820, ipfixRecord=0x7fffe8013550)
    at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:771
#15 0x00005555557998c3 in IpfixParser::processDataSet (this=0x555555b9d820, sourceId=..., message=..., 
    set=0x7fffe8013c00, endOfMessage=0x7fffe8013c64 "")
    at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:352
#16 0x000055555579bce1 in IpfixParser::processIpfixPacket (this=0x555555b9d820, message=..., length=116, sourceId=...)
    at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:608
#17 0x000055555579c967 in IpfixParser::processPacket (this=0x555555b9d820, message=..., length=116, sourceId=...)
    at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:676
#18 0x0000555555796e0d in IpfixReceiverFile::run (this=0x555555b942e0)
    at /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:287
#19 0x0000555555708461 in IpfixReceiver::threadWrapper (ipfixReceiver_=0x555555b942e0)
    at /home/user/vermont/src/modules/ipfix/IpfixReceiver.cpp:170
#20 0x00007ffff7bbd7fc in start_thread (arg=0x7ffff0bca700) at pthread_create.c:465
#21 0x00007ffff5de5b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
<ipfixConfig>

<ipfixReceiverFile id="1">
  <next>2</next>
</ipfixReceiverFile>

<ipfixAggregator id="2">
<!-- Aggregator agg1 -->
        <rule>
                <flowKey>
                       <ieName>destinationIPv4Address</ieName>
                       <autoAddV4PrefixLength>false</autoAddV4PrefixLength>
                </flowKey>
                <flowKey>
                       <ieName>sourceIPv4Address</ieName>
                       <autoAddV4PrefixLength>false</autoAddV4PrefixLength>
                </flowKey>
                <nonFlowKey>
                       <ieName>applicationId</ieName>
                </nonFlowKey>
        </rule>
        <expiration>
               <inactiveTimeout unit="sec">5</inactiveTimeout>
               <activeTimeout unit="sec">30</activeTimeout>
        </expiration>
        <next>3</next><!-- Exporter exp1 -->
</ipfixAggregator>

<ipfixExporter id="3">
<!-- Exporter exp1 -->
        <protocolVersion>NFV9</protocolVersion>
        <maxRecordRate>0</maxRecordRate>
        <collector>
                <transportProtocol>UDP</transportProtocol>
                <ipAddress>10.10.10.1</ipAddress>
                <port>2055</port>
        </collector>
</ipfixExporter>

</ipfixConfig>
17:23:47.651[0] WARNING /home/user/vermont/src/vermont.cc:340: starting up vermont config manager
17:23:47.651[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFileCfg.cpp:71: CollectorConfiguration: Successfully parsed collectingProcess section
17:23:47.651[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixExporterCfg.cpp:40: Exporter: using maximum rate of 0 records/second
17:23:47.651[0] NOTICE /home/user/vermont/src/core/Connector.cpp:51: Connecting module ipfixReceiverFile[Id = 1] -> ipfixAggregator[Id = 2]
17:23:47.651[0]   INFO /home/user/vermont/src/core/Connector.cpp:54: connecting instances
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:128: IpfixReceiverFile initialized with the following parameters:
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:129:   - packet_file_directory = ./
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:130:   - packet_file_basename = ipfix.dump
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:131:   - Start (from) = 0
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:132:   - End (to) = 0
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:133:   - ignoreTimestamps = true
17:23:47.652[0]   INFO /home/user/vermont/src/modules/ipfix/IpfixRecord.cpp:167: received unknown field type applicationId (id=95, length=65535)
17:23:47.652[0]   INFO /home/user/vermont/src/modules/ipfix/aggregator/Rule.cpp:82: valid protocols for this template: 07
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseHashtable.cpp:59: Hashtable initialized with following parameters:
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseHashtable.cpp:60:   - inactiveTimeout=5
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseHashtable.cpp:61:   - activeTimeout=30
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseHashtable.cpp:62:   - htableBits=17
17:23:47.652[0]   INFO /home/user/vermont/src/modules/ipfix/IpfixRecord.cpp:279: TemplateInfo: need to increase number of uniqueIds, oldSize=0
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseAggregator.cpp:135: Done. Parsed 1 rules; inactiveTimeout 5, activeTimeout 30
17:23:47.652[0] NOTICE /home/user/vermont/src/core/Connector.cpp:51: Connecting module ipfixAggregator[Id = 2] -> ipfixExporter[Id = 3]
17:23:47.652[0]   INFO /home/user/vermont/src/core/Connector.cpp:54: connecting instances
17:23:47.652[0]   INFO /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:114: IpfixSender: running
17:23:47.652[0]   INFO /home/user/vermont/src/modules/ipfix/IpfixExporterCfg.cpp:128: IpfixExporter: adding collector UDP://10.10.10.1:2055
17:23:47.652[0] NOTICE /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:191: IpfixSender: adding UDP://10.10.10.1:2055 to exporter
17:23:47.652[0] CRITICAL /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:234: getsockopt(...,IP_MTU,...) failed, Transport endpoint is not connected
17:23:47.652[0]   INFO /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:593: get_mtu() returned -1
17:23:47.652[0]   INFO /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:879: Removing collector.
17:23:47.652[0]   INFO /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:893: Closing data socket
17:23:47.652[0] CRITICAL /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:230: IpfixSender: ipfix_add_collector of 10.10.10.1:2055 to exporter
17:23:47.652[0] WARNING /home/user/vermont/src/core/Cfg.h:265: queue is required by module id=3 but is not configured. Inserting a default queue with max size 1 (attention: this is inefficient!)
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:260: NodeID=0, Modulename=ipfixReceiverFile
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:228: called (0)
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:235: module 0 -> module 1
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:228: called (1)
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:235: module 1 -> module 2
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:228: called (2)
17:23:47.652[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:243: postI[0] = 2
17:23:47.653[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:243: postI[1] = 1
17:23:47.653[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:243: postI[2] = 0
17:23:47.653[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:260: NodeID=1, Modulename=ipfixAggregator
17:23:47.653[0]  DEBUG /home/user/vermont/src/core/Graph.cpp:260: NodeID=2, Modulename=ipfixExporter
17:23:47.653[0]   INFO /home/user/vermont/src/core/Graph.cpp:271: topological sort #0: ipfixReceiverFile[1]
17:23:47.653[0]   INFO /home/user/vermont/src/core/Graph.cpp:271: topological sort #1: ipfixAggregator[2]
17:23:47.653[0]   INFO /home/user/vermont/src/core/Graph.cpp:271: topological sort #2: ipfixExporter[3]
17:23:47.653[0] NOTICE /home/user/vermont/src/modules/ConfigManager.cpp:201: Starting module ipfixExporter
17:23:47.653[0]   INFO /home/user/vermont/src/common/TimeoutSemaphore.h:262: restarting
17:23:47.653[0]   INFO /home/user/vermont/src/common/TimeoutSemaphore.h:262: restarting
17:23:47.653[0]   INFO /home/user/vermont/src/common/Thread.h:44: creating new thread: ConnectionQueue
17:23:47.653[0] NOTICE /home/user/vermont/src/modules/ConfigManager.cpp:201: Starting module ipfixAggregator
17:23:47.653[0]   INFO /home/user/vermont/src/modules/ipfix/aggregator/BaseAggregator.cpp:65: called
17:23:47.653[0]   INFO /home/user/vermont/src/core/InstanceManager.h:103: adding used instance 0x5596a8ea6790
17:23:47.653[1]   INFO /home/user/vermont/src/core/ConnectionQueue.h:262: starting thread
17:23:47.653[0]   INFO /home/user/vermont/src/core/ConnectionQueue.h:54: receive(Packet*)
17:23:47.653[0]  DEBUG /home/user/vermont/src/common/ConcurrentQueue.h:68: () trying to push element (0 elements in queue)
17:23:47.653[0]  DEBUG /home/user/vermont/src/common/ConcurrentQueue.h:91: () element pushed (1 elements in queue)
17:23:47.653[0]   INFO /home/user/vermont/src/common/Thread.h:44: creating new thread: BaseAggregator
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ConcurrentQueue.h:37: () element popped
17:23:47.653[1]   INFO /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:269: IpfixSender: Template received (setid=2, id=0)
17:23:47.653[0] NOTICE /home/user/vermont/src/modules/ConfigManager.cpp:201: Starting module ipfixReceiverFile
17:23:47.653[1]   INFO /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:304: IpfixSender: Use Template ID 256 instead of 0.
17:23:47.653[0]   INFO /home/user/vermont/src/common/Thread.h:44: creating new thread: IpfixReceiver
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:956: ipfix_find_template with ID: 256
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:972: ipfix_get_free_template_slot
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:977: ipfix_get_free_template_slot found at 0
17:23:47.653[2] NOTICE /home/user/vermont/src/modules/ipfix/aggregator/BaseAggregator.cpp:157: Polling aggregator each 0 msec
17:23:47.653[0] WARNING /home/user/vermont/src/vermont.cc:347: vermont is up and running
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:956: ipfix_find_template with ID: 256
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:964: ipfix_find_template with ID: 256, validity 1 found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:174: IpfixReceiverFile: Trying to read message from file "./ipfix.dump0000000000"
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2612: template found at 0
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2613: A p_pos 0x7f241c0015f0, p_end 0x7f241c001610
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2614: max_fields_length 32
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2615: fields_length 8
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2620: B p_pos 0x7f241c0015f8, p_end 0x7f241c001610
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:286: Data block starts with: 0 a 0 7c
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:956: ipfix_find_template with ID: 256
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/IpfixRecord.cpp:279: TemplateInfo: need to increase number of uniqueIds, oldSize=1
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:964: ipfix_find_template with ID: 256, validity 1 found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:138: Destroy template - no matching template found (id=256)
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2612: template found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/core/InstanceManager.h:103: adding used instance 0x7f240c013a60
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2613: A p_pos 0x7f241c0015f0, p_end 0x7f241c001610
17:23:47.653[3]   INFO /home/user/vermont/src/core/InstanceManager.h:151: removing used instance 0x7f240c013a60
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2614: max_fields_length 32
17:23:47.653[3]  DEBUG /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:646: IPFIX message from 1.0.0.127:0 ( contained 0 Data Records and 1 Template Records. Sequence number was 0.
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2615: fields_length 12
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:286: Data block starts with: 0 a 0 74
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2620: B p_pos 0x7f241c0015fc, p_end 0x7f241c001610
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:43: ALL TEMPLATES ---------------------------
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:956: ipfix_find_template with ID: 256
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:45: bt->sourceID odid 0 exporter port 0 collector port 0 exporter ip 1.0.0.127 len 4 prot 0 ptr: 0x7f240c0011a0 size: 40 expires in -1520875427 sec
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:964: ipfix_find_template with ID: 256, validity 1 found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:49: END ALL TEMPLATES --------------------------
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2612: template found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:52: Searching for : sourceID 0 0 0 1.0.0.127  4 0
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2613: A p_pos 0x7f241c0015f0, p_end 0x7f241c001610
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/TemplateBuffer.cpp:62: Template found.
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2614: max_fields_length 32
17:23:47.653[3]   INFO /home/user/vermont/src/core/InstanceManager.h:103: adding used instance 0x7f240c013550
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2615: fields_length 16
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/aggregator/IpfixAggregator.cpp:57: Got a Data Record

17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:2620: B p_pos 0x7f241c001600, p_end 0x7f241c001610
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/aggregator/IpfixAggregator.cpp:75: rule 0 matches

17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:956: ipfix_find_template with ID: 256
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:672: called
17:23:47.653[1]  DEBUG /home/user/vermont/src/common/ipfixlolib/ipfixlolib.c:964: ipfix_find_template with ID: 256, validity 1 found at 0
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:724: Tried to set mask of length 4 IP address
17:23:47.653[1]   INFO /home/user/vermont/src/modules/ipfix/IpfixSender.cpp:358: IpfixSender: created template with ID 256
17:23:47.653[3]   INFO /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:709: Tried to set mask of length 4 IP address

17:23:47.653[1]   INFO /home/user/vermont/src/core/InstanceManager.h:151: removing used instance 0x5596a8ea6790
17:23:47.654[3]  DEBUG /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:459: nhash=12553
17:23:47.654[3]  DEBUG /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:536: creating new bucket

vermont.xml.gz ipfix.dump0000000000.gz vermont.log.gz

nickbroon commented 6 years ago

Compiling with the following small diff to disable the InstanceManager it still crashes, just not in the aggregator hastable IM anymore. This time in deep in IpfixReceiverFile() buffer management code. Anyone able to offer any clues?

diff --git a/src/core/InstanceManager.h b/src/core/InstanceManager.h
index b0788ef14f9b..75b6df6a19c9 100644
--- a/src/core/InstanceManager.h
+++ b/src/core/InstanceManager.h
@@ -30,6 +30,8 @@
 #include <list>
 #include <algorithm>

+#define IM_DISABLE 1
+
 using namespace std;

 /**

./vermont -l debug -f ./vermont.xml

11:24:47.103[3]   INFO /home/brownn/work/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:709: Tried to set mask of length 4 IP address

11:24:47.103[3]  DEBUG /home/brownn/work/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:459: nhash=128727
11:24:47.103[3]  DEBUG /home/brownn/work/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:536: creating new bucket
11:24:47.103[3]   INFO /home/brownn/work/vermont/src/core/InstanceManager.h:158: removing used instance 0x7fffe8013550
11:24:47.103[3]  ERROR /home/brownn/work/vermont/src/modules/ipfix/IpfixParser.cpp:593: IpfixParser: Invalid set length 0, must be >= 4
*** Error in `/home/brownn/work/vermont/vermont': free(): invalid pointer: 0x00007fffe8013bf0 ***

Thread 4 "IpfixReceiver" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff0bca700 (LWP 336)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5d09f5d in __GI_abort () at abort.c:90
#2  0x00007ffff5d5228d in __libc_message (action=action@entry=do_abort, 
    fmt=fmt@entry=0x7ffff5e79528 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff5d5964a in malloc_printerr (action=<optimised out>, str=0x7ffff5e75eae "free(): invalid pointer", 
    ptr=<optimised out>, ar_ptr=<optimised out>) at malloc.c:5426
#4  0x00007ffff5d5b73e in _int_free (av=0x7ffff60abc20 <main_arena>, p=<optimised out>, have_lock=0) at malloc.c:4175
#5  0x00007ffff5d6044e in __GI___libc_free (mem=<optimised out>) at malloc.c:3145
#6  0x00005555556e2f3b in boost::checked_array_delete<unsigned char> (x=0x7fffe8013bf0 "")
    at /usr/include/boost/core/checked_delete.hpp:41
#7  0x00005555556e2d4a in boost::checked_array_deleter<unsigned char>::operator() (this=0x7fffe8023c18, 
    x=0x7fffe8013bf0 "") at /usr/include/boost/core/checked_delete.hpp:63
#8  0x00005555556e3309 in boost::detail::sp_counted_impl_pd<unsigned char*, boost::checked_array_deleter<unsigned char> >::dispose (this=0x7fffe8023c00) at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:153
#9  0x00005555556c1c4e in boost::detail::sp_counted_base::release (this=0x7fffe8023c00)
    at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#10 0x00005555556c1ccb in boost::detail::shared_count::~shared_count (this=0x7ffff0bc99d8, __in_chrg=<optimised out>)
    at /usr/include/boost/smart_ptr/detail/shared_count.hpp:473
#11 0x00005555556e1f24 in boost::shared_array<unsigned char>::~shared_array (this=0x7ffff0bc99d0, 
    __in_chrg=<optimised out>) at /usr/include/boost/smart_ptr/shared_array.hpp:44
#12 0x000055555576b7d0 in boost::shared_array<unsigned char>::reset<unsigned char> (this=0x7ffff0bc9a40, 
    p=0x7fffe8023c30 "") at /usr/include/boost/smart_ptr/shared_array.hpp:178
#13 0x000055555576a288 in IpfixReceiverFile::run (this=0x555555b4bef0)
    at /home/brownn/work/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:208
#14 0x00005555556ea9dd in IpfixReceiver::threadWrapper (ipfixReceiver_=0x555555b4bef0)
    at /home/brownn/work/vermont/src/modules/ipfix/IpfixReceiver.cpp:170
#15 0x00007ffff7bbd7fc in start_thread (arg=0x7ffff0bca700) at pthread_create.c:465
#16 0x00007ffff5de5b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
ogasser commented 6 years ago

@nickbroon Can you provide a pcap file for a MWE?

nickbroon commented 6 years ago

The 'ipfix.dump0000000000.gz' attachment to my first comment should provide a MWE.

ogasser commented 6 years ago

Oh, I overlooked that :) Thx!

nickbroon commented 6 years ago

With the above config (and #define IM_DISABLE 1) after processing the first 2 fields in the first dataset encounter (in the second frame, which also contains a template) on the third field applicationId it appears the FlowHashtable::copyData() is overwriting the original IpfixSetHeader (I set a watch on set->id and set->length from IpfixParser::processDataSet()), and almost certainly more of the original ipfix frame. This is clearly wrong, though I've yet to correctly determine exactly what FlowHashtable::copyData() is doing incorrectly for applicationId and what needs fixing. Any advise greatly received.

#0  0x00007ffff5d837a6 in ?? ()
#1  0x0000555555786707 in FlowHashtable::copyData (this=0x555555b553f0, dstFI=0x555555b56770, dst=0x7fffe4013b90 "\n\n\001\002\n\n\001\001", srcFI=0x7fffe4013918, src=0x7fffe4013c74 "", modifier=Rule::Field::AGGREGATE) at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:585
#2  0x0000555555787289 in FlowHashtable::aggregateDataRecord (this=0x555555b553f0, record=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:700
#3  0x0000555555719a9b in IpfixAggregator::onDataRecord (this=0x555555b4c350, record=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/aggregator/IpfixAggregator.cpp:76
#4  0x00005555556fad8c in IpfixRecordDestination::receive (this=0x555555b4c4c8, ipfixRecord=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/IpfixRecordDestination.cpp:40
#5  0x0000555555681b27 in ConnectionSplitter<IpfixRecord*>::process (this=0x555555b4c1e0, packet=0x7fffe4013550) at /home/user/vermont/src/core/ConnectionSplitter.h:93
#6  0x000055555567d45d in ConnectionSplitter<IpfixRecord*>::receive (this=0x555555b4c1e0, packet=0x7fffe4013550) at /home/user/vermont/src/core/ConnectionSplitter.h:66
#7  0x0000555555682ad5 in Source<IpfixRecord*>::send (this=0x555555b564a0, t=0x7fffe4013550) at /home/user/vermont/src/core/Source.h:98
#8  0x00005555556f3d16 in IpfixCollector::send (this=0x555555b56440, ipfixRecord=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/IpfixCollector.cpp:102
#9  0x000055555577193c in IpfixParser::push (this=0x555555b56580, ipfixRecord=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:771
#10 0x000055555576dc4f in IpfixParser::processDataSet (this=0x555555b56580, sourceId=..., message=..., set=0x7fffe4013c70, endOfMessage=0x7fffe4013cd4 "") at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:351
#11 0x000055555577006d in IpfixParser::processIpfixPacket (this=0x555555b56580, message=..., length=116, sourceId=...) at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:608
#12 0x0000555555770cf3 in IpfixParser::processPacket (this=0x555555b56580, message=..., length=116, sourceId=...) at /home/user/vermont/src/modules/ipfix/IpfixParser.cpp:676
#13 0x000055555576b199 in IpfixReceiverFile::run (this=0x555555b4bce0) at /home/user/vermont/src/modules/ipfix/IpfixReceiverFile.cpp:287
#14 0x00005555556ea9dd in IpfixReceiver::threadWrapper (ipfixReceiver_=0x555555b4bce0) at /home/user/vermont/src/modules/ipfix/IpfixReceiver.cpp:170
#15 0x00007ffff7bbd7fc in ?? ()
#16 0x0000000000000000 in ?? ()
Thread 3 "IpfixReceiver" hit Breakpoint 6, FlowHashtable::copyData (this=0x555555b553f0, dstFI=0x555555b56770, dst=0x7fffe4013b90 "\n\n\001\002\n\n\001\001\067:0 (", srcFI=0x7fffe4013918, src=0x7fffe4013c74 "\n\n\001\001\n\n\001\002\001", modifier=Rule::Field::AGGREGATE) at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:556
556 {
(gdb) p *dstFI
$11 = (TemplateInfo::FieldInfo *) 0x555555b56770
(gdb) p *dstFI
$12 = {type = {id = 95, enterprise = 0, length = 65535}, offset = 8, privDataOffset = 0, isVariableLength = true, basicListData = {semantic = 120 'x', fieldIe = 0x0}}
(gdb) p *srcFI
$13 = {type = {id = 95, enterprise = 0, length = 5}, offset = 69, privDataOffset = 0, isVariableLength = false, basicListData = {semantic = 0 '\000', fieldIe = 0x0}}
(gdb)

So the question I guess is about copying applicationId which is id 95. It appears the source length 5 and marked as not isVariableLength while the destination has length of 65535 and marked as isVariableLength. Any suggestion as to what is going on here are welcome.

nickbroon commented 6 years ago

Strangely the fieldLength being used by the Hashtable is only 8. This probably the length of the 2 flowKey fields, destinationIPv4Address and sourceIPv4Address, but no account of length of the nonFlowKey field applicationId has been taken.

#1  0x0000555555787289 in FlowHashtable::aggregateDataRecord (this=0x555555b553f0, record=0x7fffe4013550) at /home/user/vermont/src/modules/ipfix/aggregator/FlowHashtable.cpp:700
700             copyData(hfi, htdata.get(), tfi, data, fieldModifier[i]);
(gdb) p *record
$17 = {<IpfixRecord> = {_vptr.IpfixRecord = 0x555555a88d28 <vtable for IpfixDataRecord+16>, sourceID = {px = 0x7fffe40011a0, pn = {pi_ = 0x7fffe40011d0}}, threeByteIndicator = 255 '\377', variableLenData = 0x0, variableLenDataTotalBytes = 100, variableLenDataCurrBytes = 0}, <ManagedInstance<IpfixDataRecord>> = {_vptr.ManagedInstance = 0x555555a88d58 <vtable for IpfixDataRecord+64>, myInstanceManager = 0x555555a97c40 <IpfixParser::dataRecordIM>, referenceCount = 1, deletedByManager = false}, templateInfo = {px = 0x7fffe4013610, pn = {pi_ = 0x555555b563d0}}, dataLength = 96, message = {px = 0x7fffe4013c60 "", pn = {pi_ = 0x7fffe4013ac0}}, data = 0x7fffe4013c74 "\n\n\001\001\n\n\001\002\001"}
(gdb) p fieldLength
$18 = 8

It appears that vermont treats applicationId as an octetArray, and always treats octetArrays as variable length. (ie as having a length of 65535)

In BaseHashtable::createDataTemplate() it appears fieldLength will not increased, as it's neither a basic field nor IPFIX_TYPEID_basicList as this code snippets show. (There should probably be an extra else here with a warning/error here)

if (!fi->isVariableLength) {
                fieldLength += fi->type.length;
            }
            // Variable length fields: Extract real length information
            else if (fi->type == InformationElement::IeInfo(IPFIX_TYPEID_basicList, 0)) {
                fi->basicListData.semantic = rf->semantic;
                fi->basicListData.fieldIe = new InformationElement::IeInfo(rf->fieldIe);

                // Length is one pointer, as we are storing data in dynamically allocated vector (i.e. pointer to vector)
                fieldLength += sizeof(vector<void*>*);
            }

So the question is can it be extended to handle variable length octetArrays?

It's worth noting that octetArrays fields need not always be variable length, and can be fixed length encoded using a size smaller than 65535 if they want. In the case of the sample data used here the record contains applicationID encoded with a fixed length of 5.

Even if internally vermont treats all octetArrays as variable length, should it be possible to copy into it fixed length encoded otcetArray data?

ogasser commented 6 years ago

@nickbroon Thanks for taking the time to analyze this behavior!

I think we need to fix several things, in order to add support for applicationId and avoid these bug in the future:

nickbroon commented 6 years ago

@ogasser Thanks for the pointers. I'll use these to look at extending the support. First priority will be to induce a hard error when it attempts to parse any 'octetArray' or 'string' type field to avoid the crash. Then I'll look to have parse and aggregate octetArray (of which applicationId is one) and string type fields (they very similar, so should be treatable in the same way).

nickbroon commented 6 years ago

When you say "Use real length in BaseHashtable::createDataTemplate() for variable length IEs other than basicList" I'm not sure I fully understand. What is the real length for the Rule::Field in the case of a octetArray field like applicationId? In a aggregator rule there is no given length, so these are simply marked as 65535 in the Rule::Field. So I'm unsure how to allocate space for this field in hashtable. I suppose it could stored as pointer to dynamic memory, (like for the basicList) and then when actual flows are received for to be inserted into the hashtable, the length can be extracted from them and the memory allocated. (bearing in mind that the field in received flow might be fix length or variable length encode). Or am I missing understanding how rules deal with fields and their encoding and length?

ogasser commented 6 years ago

You are completely right. Since we do not know how long variable length IEs will be, we store them dynamically. Therefore the length of that field is the length of one pointer, similar as for basicList.

nickbroon commented 6 years ago

As an alternative, instead of Vermont always treating fields of type IPFIX_TYPE_string or IPFIX_TYPE_octetArray as variable length encoded it could treat them fix length encoded, with either a hard coded length defined in the code (say 256), or allowing the length to be configured per field in the vermont config used to generate the aggregator rules template, and the field data truncated or zero extended to fill the encoded space.

Examples might be:

<nonFlowKey>
     <ieName>interfaceName</ieName>
     <length>10</length>
</nonFlowKey>
<nonFlowKey>
     <ieName>applicationID</ieName>
     <length>5</length>
</nonFlowKey>
<nonFlowKey>
     <ieName>ipHeaderPacketSection</ieName>
     <length>56</length>
</nonFlowKey>

I'll explore both options ,treating them as variable length encoded like basicList or configurable fixed length encode, and determine which is both best and easiest to implement. For a lot of typically short IPFIX_TYPE_string and IPFIX_TYPE_octetArray fields like applicationID and interfaceName collectors often expect to see fixed length encoded data.

nickbroon commented 6 years ago

While examining the code related to flow aggregation I'm not sure I understand how flowKey/nonFlowKey interacts with the aggregation. I would assume that any field configured as nonFlowKey should be aggregated, and those configured as flowKey not aggregated.

The FlowHashtable::aggregateFlow function that performs the actual aggregation does not appear to take the key status of field into consideration, basing the choice to aggregate on the return value of isToBeAggregated() which appears to use a fixed table of `type.id' to determine this.

flowKey/nonFlowKey appears to only be used in AggregatorBaseCfg::readNonFlowKeyRule and AggregatorBaseCfg::readFlowKeyRule to set ruleField->modifier = Rule::Field::AGGREGATE or ruleField->modifier = Rule::Field::KEEP and then ruleField->modifier is used in FlowHashtable::copyData while building a flow for consideration of inserting/aggregating into the hashtable, but AGGREGATE and KEEP are not treated any different.

Am I misunderstanding something? I simply don't see how flowKey/nonFlowKey configuration is effecting how flows are aggregated together when flow is found in the hash table.

ogasser commented 6 years ago

You are correct, the flowKey and nonFlowKey config directives seem to have no effect as I see it in the code.

nickbroon commented 6 years ago

That's what I suspected. I'll do some testing to confirm this, but if it's the case it would seem to be a pretty serious bug in Vermont!

ogasser commented 6 years ago

Yes, this is definitely not what users expect when configuring the Vermont.

nickbroon commented 6 years ago

Would this be something you'd have time to look into fixing?

ogasser commented 6 years ago

Unfortunately not in the coming weeks :( If you find the time, however, I would make sure to discuss and review with you the proposed fixes.