GaloisInc / adapt

ADAPT software for Transparent Computing
BSD 3-Clause "New" or "Revised" License

Automate SPADE CDM Collection #12

Closed: TomMD closed this issue 8 years ago

TomMD commented 8 years ago

To get data, we can use CDM files from CADETS, ProvN files from SPADE, or live CDM streams from SPADE. AFAICT, none of these is really what we want, sadly.

Ideally, we should acquire CDM files from SPADE. This is complicated by the fact that SPADE does not natively output CDM, but the basic components are all there to solve the problem in one of three ways:

  1. Adding CDM output to SPADE and upstreaming the patch to Ashish.
  2. Scripting SPADE to record data to a raw log, then streaming to Kafka from this log rather than live, to obtain similar behavior.
  3. Writing a Kafka -> CDM container object helper. This would probably be best to upstream to ta3.
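
For illustration, option 2 (record to a raw log, then replay it instead of streaming live) can be sketched as below. This is a minimal sketch under stated assumptions, not what SPADE or Adapt actually does: the 4-byte length-prefix framing and the names `append_record`, `read_records`, and `replay` are hypothetical, and `send` stands in for whatever Kafka producer call would be used.

```python
import io
import struct
from typing import BinaryIO, Callable, Iterator

# Assumed framing: each record is a 4-byte big-endian length, then the payload.
_LEN = struct.Struct(">I")


def append_record(log: BinaryIO, payload: bytes) -> None:
    """Append one serialized record to the raw log."""
    log.write(_LEN.pack(len(payload)))
    log.write(payload)


def read_records(log: BinaryIO) -> Iterator[bytes]:
    """Yield payloads in the order they were recorded."""
    while True:
        header = log.read(_LEN.size)
        if not header:
            return  # clean end of log
        if len(header) < _LEN.size:
            raise EOFError("truncated length prefix")
        (size,) = _LEN.unpack(header)
        payload = log.read(size)
        if len(payload) < size:
            raise EOFError("truncated record")
        yield payload


def replay(log: BinaryIO, send: Callable[[bytes], None]) -> int:
    """Feed every recorded payload to `send` (a stand-in for a Kafka
    producer call) and return the number of records replayed."""
    count = 0
    for payload in read_records(log):
        send(payload)
        count += 1
    return count


# Usage: record two payloads in memory, then replay them into a list.
log = io.BytesIO()
append_record(log, b"record-1")
append_record(log, b"record-2")
log.seek(0)
received = []
replayed = replay(log, received.append)
```

Replaying from a file keeps runs repeatable, at the cost of losing the original timing between records unless timestamps are stored alongside each payload.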

TomMD commented 8 years ago

Our Adapt-SPADE repository now has a version of SPADE that can write to an Avro container object directly (option 1). However, Trint can't read this file, and it appears to be in a very slightly different format from the Avro container objects produced by CADETS. There is some schema disagreement between the CADETS and SPADE systems (and thus our system) that needs to be identified.
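
One way to start narrowing down a disagreement like this is to compare the writer schemas embedded in the two files. An Avro object container file begins with the magic bytes `Obj\x01`, followed by a metadata map whose `avro.schema` entry holds the schema as JSON. The sketch below reads that header with the standard library only and reports which named types differ. The helper names are hypothetical, and this only surfaces differences; it does not claim to identify the actual mismatch discussed here.

```python
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag encoded)."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("truncated varint")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)


def read_bytes(stream):
    """Decode one Avro bytes/string value: a long length, then the data."""
    size = read_long(stream)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("truncated value")
    return data


def read_metadata(stream):
    """Read the header metadata map from a binary stream."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            return meta
        if count < 0:  # a negative count is followed by the block byte size
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)


def embedded_schema(stream):
    """Return the writer schema stored in the file header, parsed from JSON."""
    return json.loads(read_metadata(stream)["avro.schema"].decode("utf-8"))


def named_types(schema, found=None):
    """Collect every record, enum, and fixed definition by name."""
    found = {} if found is None else found
    if isinstance(schema, list):  # union: visit each branch
        for branch in schema:
            named_types(branch, found)
    elif isinstance(schema, dict):
        if schema.get("type") in ("record", "enum", "fixed"):
            found[schema["name"]] = schema
        for field in schema.get("fields", []):
            named_types(field["type"], found)
        for key in ("items", "values"):  # array and map element types
            if key in schema:
                named_types(schema[key], found)
        if isinstance(schema.get("type"), (dict, list)):
            named_types(schema["type"], found)
    return found


def diff_schemas(a, b):
    """Report named types present in only one schema or defined differently."""
    ta, tb = named_types(a), named_types(b)
    return {
        "only_in_a": sorted(ta.keys() - tb.keys()),
        "only_in_b": sorted(tb.keys() - ta.keys()),
        "changed": sorted(n for n in ta.keys() & tb.keys() if ta[n] != tb[n]),
    }
```

To use it, open each file in binary mode, pass the file objects to `embedded_schema`, and hand the two results to `diff_schemas`. Because definitions are compared whole, a record that contains a changed nested record is also reported as changed.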

TomMD commented 8 years ago

The intended changes are done on our side.

  1. SPADE can now save to CDMFile (CDM 1.0 currently). See, for example, AdaptMisc.git/scenarios/bad-ls.
  2. Adapt-Ingest (Trint) can now read CDM 1.0.
  3. adapt.dev.galois.com is using the SPADE fork with CDMFile, which is not yet upstream. See @GaloisInc/SPADE