orymeyer / collective-intelligence-framework

Automatically exported from code.google.com/p/collective-intelligence-framework

free form text parser #176

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
To be used with:

 * blog parser
 * SMTP gateway

It should:

 1. use plugins
 2. convert data types to a standardized format (IODEF)
 3. return a protocol buffer hash with the original text embedded
 4. be a library (Iodef::Text ?)
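
A rough sketch of how the requirements above might fit together, in Python for illustration only. Everything here is an assumption rather than an existing CIF API: the `PLUGINS` registry, the `plugin` decorator, the two extractors, and `parse_text` are hypothetical names, and a plain dict stands in for the protocol buffer hash. Conversion to real IODEF is left out.

```python
import re
from typing import Callable, Dict, List

# Hypothetical plugin registry: type name -> extractor function.
PLUGINS: Dict[str, Callable[[str], List[str]]] = {}


def plugin(name: str):
    """Register an extractor under a data type name (illustrative only)."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("ipv4")
def extract_ipv4(text: str) -> List[str]:
    # Find dotted quads, then drop any with an octet above 255.
    candidates = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
    return [c for c in candidates if all(int(p) <= 255 for p in c.split("."))]


@plugin("md5")
def extract_md5(text: str) -> List[str]:
    # 32 hex characters, normalized to lower case.
    return [m.lower() for m in re.findall(r"\b[a-fA-F0-9]{32}\b", text)]


def parse_text(text: str) -> dict:
    """Run every registered plugin and embed the original text in the result.

    The returned dict is a stand-in for the protocol buffer hash; the record
    layout is an assumption, not IODEF.
    """
    records = []
    for name, extract in PLUGINS.items():
        for value in extract(text):
            records.append({"type": name, "value": value})
    return {"original_text": text, "records": records}


result = parse_text("seen 192.0.2.1 dropping D41D8CD98F00B204E9800998ECF8427E")
```

New data types would be added by writing another decorated extractor, so the blog parser and the SMTP gateway could share one library and differ only in how they obtain the text.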

Original issue reported on code.google.com by saxjazm...@gmail.com on 16 Jul 2012 at 2:14

GoogleCodeExporter commented 9 years ago

Original comment by saxjazm...@gmail.com on 16 Jul 2012 at 2:14

GoogleCodeExporter commented 9 years ago
http://www.codeproject.com/Articles/23198/C-String-Toolkit-StrTk-Tokenizer

Original comment by saxjazm...@gmail.com on 16 Jul 2012 at 7:21

GoogleCodeExporter commented 9 years ago
Basic "data types" include IP addresses, FQDNs, hashes (UUID, MD5, SHA-1), URLs, etc.
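
One possible way to recognize these data types for a single token, sketched in Python with the standard library only. The `classify` function, the type labels it returns, and the regular expressions are assumptions made for illustration; they are not taken from CIF or from any IODEF schema, and the patterns are deliberately simple.

```python
import ipaddress
import re
from typing import Optional
from urllib.parse import urlparse

# Hex digest lengths mapped to hash names (assumed labels).
HASH_LENGTHS = {32: "md5", 40: "sha1"}

HEX_RE = re.compile(r"[0-9a-fA-F]+")
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
# Dot-separated labels ending in an alphabetic TLD; a simplification of real FQDN rules.
FQDN_RE = re.compile(
    r"(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}",
    re.IGNORECASE,
)


def classify(token: str) -> Optional[str]:
    """Return an assumed type label for a token, or None if nothing matches."""
    token = token.strip()

    # IP addresses first, so dotted quads are never mistaken for host names.
    try:
        return "ipv%d" % ipaddress.ip_address(token).version
    except ValueError:
        pass

    if HEX_RE.fullmatch(token) and len(token) in HASH_LENGTHS:
        return HASH_LENGTHS[len(token)]

    if UUID_RE.fullmatch(token):
        return "uuid"

    try:
        parsed = urlparse(token)
    except ValueError:
        parsed = None
    if parsed and parsed.scheme in ("http", "https", "ftp") and parsed.netloc:
        return "url"

    if FQDN_RE.fullmatch(token):
        return "fqdn"

    return None
```

The order of the checks matters: a 32-character hex string is treated as an MD5 digest rather than an unhyphenated UUID, which is a design choice of this sketch and not something the issue specifies.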

Original comment by saxjazm...@gmail.com on 24 Jul 2012 at 4:14

GoogleCodeExporter commented 9 years ago

Original comment by saxjazm...@gmail.com on 17 Oct 2012 at 4:14

GoogleCodeExporter commented 9 years ago
https://github.com/collectiveintel/cif-v2/issues/5

Original comment by saxjazm...@gmail.com on 5 Apr 2013 at 2:39