infochimps-labs / icss

Infochimps Stupid Schema library: an avro-compatible data description standard. ICSS completely describes a collection of data (and associated assets) in a way that is expressive, scalable and sufficient to drive remarkably complex downstream processes.
http://infochimps.com
MIT License

ICSS + Gorillib migration #1

Closed mrflip closed 13 years ago

mrflip commented 13 years ago

Pushed the new gorillib and icss + gorillib to master.

Integration points are george, apeyeye, hackboxen, buzzkill and troop.

active_support vs gorillib

For george, check that the icss/core_ext isn't included -- see infochimps/george#4

If your library is non-rails yet you feel it can't be migrated off active_support or extlib, please let me know.

core extensions

In hackboxen, troop and apeyeye, you'll see some cases where code used to know about symbolize_keys (and similar methods) and now doesn't. Gorillib requires you to explicitly enumerate how you're extending base classes, so add the appropriate require explicitly, either in icss/core_ext.rb (if it's in icss) or in your own project.
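
As a rough illustration of what an explicitly required core extension looks like, here is a minimal sketch. The require path in the comment is an assumption about gorillib's layout, so check it against your installed version; the guarded definition below is a hypothetical stand-in, not gorillib's actual code.

```ruby
# In a real project you would add an explicit require instead, e.g.:
#   require 'gorillib/hash/keys'   # assumed path; verify against your gorillib version
#
# Hypothetical stand-in showing the shape of such an extension file.
class Hash
  # Only define the method if nothing else (e.g. active_support) already has.
  unless method_defined?(:symbolize_keys)
    # Return a new hash with keys converted to symbols where possible.
    def symbolize_keys
      each_with_object({}) do |(key, val), acc|
        acc[key.respond_to?(:to_sym) ? key.to_sym : key] = val
      end
    end
  end
end

config = { 'name' => 'geonames', :kind => 'catalog' }.symbolize_keys
```

If a method like this goes missing after the migration, the fix is one require line at the top of the file that uses it, rather than a blanket include of every extension.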

Foo.receive(*constructor_args, hsh)

The signature of the class-level Foo.receive has changed. The class-level method creates a new instance, obj = self.new, and then invokes obj.receive!(hsh) on that instance. A very few places want to pass in constructor args. Foo.receive's signature used to be receive(hsh, *constructor_args), but that a) is non-vernacular and b) might leave you thinking the hsh gets applied first. So it's now receive(*constructor_args, hsh), using the extract_options! pattern.
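
A minimal sketch of the new calling convention, assuming a hypothetical Foo whose constructor takes one optional argument. The label and attrs names are made up for illustration, and the trailing-hash handling is written out by hand rather than calling active_support's extract_options!.

```ruby
# Hypothetical class illustrating receive(*constructor_args, hsh).
class Foo
  attr_reader :label, :attrs

  def initialize(label = nil)
    @label = label
    @attrs = {}
  end

  # Instance-level: apply the hash to an already-constructed object.
  def receive!(hsh)
    hsh.each { |key, val| @attrs[key.to_sym] = val }
    self
  end

  # Class-level: the trailing hash holds the attributes; everything
  # before it is passed through to the constructor.
  def self.receive(*args)
    hsh = args.last.is_a?(Hash) ? args.pop : {}
    obj = new(*args)
    obj.receive!(hsh)
    obj
  end
end

foo = Foo.receive('widget', { 'name' => 'bar', 'size' => 3 })
bare = Foo.receive({ 'name' => 'baz' })
```

Callers still using the old order, receive(hsh, 'widget'), would end up with the hash passed to the constructor and nothing received, so those call sites are the ones to grep for.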

I scanned through to find where the fancy syntax was being used, and though I think I got them all, I may have missed some.

Hashlike

mrflip commented 13 years ago

(paging @dsnyder90 @bollacker @kornypoet @aseever @Ganglion)

mrflip commented 13 years ago

ICSS lib now roundtrips to/from JSON, and I made it a spec in icss/spec (which will break for you, because your paths are different; I'll trust you to do something more polite, @kornypoet).
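
For anyone unsure what the roundtrip property means here, a minimal sketch using only the standard json library. The real spec exercises the ICSS classes; this stand-in uses a plain hash, and the schema fragment is hypothetical rather than taken from any real ICSS file.

```ruby
require 'json'

# Hypothetical schema fragment, for illustration only.
schema = {
  'name'   => 'ingredient',
  'type'   => 'record',
  'fields' => [
    { 'name' => 'cas_number', 'type' => 'string', 'doc' => 'CAS registry number' }
  ]
}

# Serialize to JSON and parse it back.
def roundtrip(hsh)
  JSON.parse(JSON.generate(hsh))
end

roundtripped = roundtrip(schema)
raise 'roundtrip changed the schema' unless roundtripped == schema
```

Note that symbol keys come back from JSON.parse as strings by default, which is one reason the key-conversion extensions matter at these integration points.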

There's some stuff in sample_message_call that should only be there for testing: message_samples_hash and all the actual fetching. Basically, anything that knows there's an internet.

In the hackboxen, a couple config.yaml files had features that made icss angry. I fixed two (geonames catalog and grand comics db) but didn't want to touch the MSDS one; here's my diff:

diff --git a/hb/engineering/chemical/msds/hazard_msds/config/config.yaml b/hb/engineering/chemical/msds/hazard_msds/config/config.yaml
index 3f4b71d..c9280bd 100644
--- a/hb/engineering/chemical/msds/hazard_msds/config/config.yaml
+++ b/hb/engineering/chemical/msds/hazard_msds/config/config.yaml
@@ -314,16 +314,6 @@ types:
     doc: Reference number in the Registry of Toxic Effects of Chemical Substances (RTECS) database
     type: string
   type: record
-- name: ingredients_list_record
-  doc: |-
-    A list of chemical ingredients. This list is not comprehensive and is often empty
-    when the chemical product contains no hazardous ingredients.
-  fields: 
-  - name: ingredients
-    items: ingredient
-    doc: An array of ingredients
-    type: array
-  type: record
 - name: first_aid_record
   doc: First aid measures in case of exposure to the chemical product
   fields: 
@@ -390,7 +380,13 @@ types:
   - name: firstaid
     type: first_aid_record
   - name: ingredients
-    type: ingredients_list_record
+    doc: |-
+      A list of chemical ingredients. This list is not comprehensive and is often empty
+      when the chemical product contains no hazardous ingredients.
+    type: 
+    - type: array
+      items: ingredient
+  type: record
   - name: accidentalrelease
     type: accidental_release_record
   - name: personalprotection

mrflip commented 13 years ago

(if there really is an ingredients_list_record field then its type should be fixed; it's invalid avro)