Meeting summarization, one more. And a training corpus for a debating system #135

«To identify decision elements within a meeting, we annotated a crowd sourced dataset known as the AMI Meeting Corpus, a multi-modal data set consisting of 100 hours of meeting recordings. We then labelled decision elements from the transcripts as alternatives (options being considered as solutions to the decision) and criteria (factors guiding the alternatives). This annotated corpus was then used to train a set of supervised classifiers for automatically extracting decision making elements. Another algorithm then processes the extracted decision and criteria to identify the expressed sentiment towards the extracted elements. In essence, if a participant mentions a specific alternative, it is important to distinguish whether he or she supports or rather opposes that specific alternative. Finally, a clustering approach is used on each class of extracted elements (alternatives and criteria) to group them semantically. For instance, the mentions of trendy, fashionable or stylish as criteria would be grouped together as they represent the same concept overall.»
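
The last step in the quote, grouping semantically similar criteria, can be illustrated with a toy sketch. This is not the authors' method: the quote does not say which clustering algorithm or representation they used. The two-dimensional vectors in `TOY_VECTORS`, the `cluster_mentions` function, and the 0.95 threshold are all hypothetical and hand-picked so that the example from the quote (trendy, fashionable, stylish) lands in one group. A real system would presumably use learned word or sentence embeddings.

```python
import math

# Hypothetical 2-D "embeddings", hand-picked for illustration only.
# A real system would use learned word or sentence vectors instead.
TOY_VECTORS = {
    "trendy": (0.9, 0.1),
    "fashionable": (0.95, 0.15),
    "stylish": (0.85, 0.2),
    "cheap": (0.1, 0.9),
    "affordable": (0.15, 0.95),
}


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def cluster_mentions(mentions, threshold=0.95):
    """Greedy clustering: a mention joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for mention in mentions:
        for cluster in clusters:
            if cosine(TOY_VECTORS[mention], TOY_VECTORS[cluster[0]]) >= threshold:
                cluster.append(mention)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([mention])
    return clusters


groups = cluster_mentions(
    ["trendy", "cheap", "fashionable", "stylish", "affordable"]
)
print(groups)
```

With these toy vectors the style-related criteria end up in one group and the price-related ones in another. The greedy scheme depends on input order and on the threshold, so it only shows the idea of grouping by similarity, not a robust clustering method.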

This is part of the Open Debater project, which took part in two exhibition debates, meaning that it understands the meaning of speech, knows a great deal, and can formulate objections and counterarguments. There is also a very rich set of corpora of English text and speech, annotated for the needs of semantic and intonation analysis.