rcc-uchicago / ling-viz

Visualization and UI work for Linguistica

Demonstrate python/D3 interface #2

Open joyrexus opened 10 years ago

joyrexus commented 10 years ago

Ask @JacksonLLee for info about the python code that will be used for morphological analysis. We'd just like to demonstrate how the data from the python-based backend can be piped as JSON to Simon's D3-driven graph renderer / UI proof-of-concept.
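As a rough illustration of the piping step, here is a minimal sketch that serializes a small word graph as node-link JSON, the `{"nodes": [...], "links": [...]}` layout commonly used in D3 force-directed examples. The function name `graph_to_d3_json` and the `id`/`color` fields are assumptions for illustration; the format Simon's renderer actually expects isn't specified here.

```python
import json


def graph_to_d3_json(node_colors, edges):
    """Serialize a graph as node-link JSON.

    node_colors: dict mapping each word to a color name (hypothetical field).
    edges: iterable of (source_word, target_word) pairs.
    """
    payload = {
        "nodes": [{"id": word, "color": color} for word, color in node_colors.items()],
        "links": [{"source": source, "target": target} for source, target in edges],
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Write the JSON to stdout so it can be redirected to a file the front end loads.
    print(graph_to_d3_json({"jump": "red", "jumped": "green"}, [("jump", "jumped")]))
```

Links refer to nodes by `id` rather than by array index, so the front end would need to resolve them by id (an assumption about how the renderer is set up).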

Ultimately we'd like to demonstrate how a web-based interface could in principle serve as a front-end UI for the python backend (e.g., a range slider allows a user to specify some parameter input needed for the python-driven analysis).
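One way this could work, sketched with only the standard library: the slider value arrives as a query parameter, the backend runs the analysis, and the result goes back as JSON. Everything named here (`analyze`, `respond_to`, the `min_stem_count` parameter, the `/graph` path) is hypothetical, and `analyze` is only a placeholder for the real python-driven analysis.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def analyze(min_stem_count):
    """Placeholder for the real analysis; just echoes the parameter back."""
    return {"min_stem_count": min_stem_count, "nodes": [], "links": []}


def respond_to(path):
    """Turn a request path such as '/graph?min_stem_count=5' into a JSON string."""
    query = parse_qs(urlparse(path).query)
    value = int(query.get("min_stem_count", ["1"])[0])  # default to 1 if absent
    return json.dumps(analyze(value))


class GraphHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = respond_to(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the demo server until interrupted (not called automatically)."""
    HTTPServer(("localhost", port), GraphHandler).serve_forever()
```

Calling `serve()` and then requesting `/graph?min_stem_count=5` from the page (for example, whenever the slider changes) would return the JSON for that parameter value.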

Note that there are at least two existing python/D3 interfaces:

jacksonllee commented 10 years ago

@JohnAGoldsmith's Linguistica outputs signatures (e.g., NULL-ed-ing-s) and their associated stems (e.g., "jump" is a stem for the signature NULL-ed-ing-s if all four words jump, jumped, jumping, and jumps are present in the corpus). I'm working on code that

  1. takes this output from Linguistica, plus a GEXF file of word neighbors from make-gephi, and
  2. outputs GEXF files containing the entire graph, with the wordforms associated with each signature colored according to their affixes.

A sample output is english-brown_All_Words_10000_singleSig_NULL-ed-ing-s_124-84-56-52.gexf. In this example, all wordforms associated with the signature NULL-ed-ing-s in this graph of 10,000 nodes (=words) are colored. Those associated with NULL are in red, -ed in green, -ing in blue, and -s in black.
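For illustration only, the coloring step might look roughly like the sketch below. It is not the actual code: it assumes the GEXF 1.2 draft namespaces, assumes each node's `label` attribute holds the word, and uses hypothetical function names (`wordform_colors`, `color_gexf`). The RGB values simply encode the red/green/blue/black scheme described above.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"
VIZ_NS = "http://www.gexf.net/1.2draft/viz"
ET.register_namespace("", GEXF_NS)
ET.register_namespace("viz", VIZ_NS)

# NULL in red, -ed in green, -ing in blue, -s in black.
AFFIX_COLORS = {"NULL": (255, 0, 0), "ed": (0, 255, 0), "ing": (0, 0, 255), "s": (0, 0, 0)}


def wordform_colors(stems, signature, colors=AFFIX_COLORS):
    """Map each wordform (stem + affix) to the RGB color of its affix."""
    mapping = {}
    for stem in stems:
        for affix in signature.split("-"):
            suffix = "" if affix == "NULL" else affix
            mapping[stem + suffix] = colors[affix]
    return mapping


def color_gexf(gexf_text, mapping):
    """Return GEXF text with a viz:color element set on every mapped node."""
    root = ET.fromstring(gexf_text)
    for node in root.iter(f"{{{GEXF_NS}}}node"):
        rgb = mapping.get(node.get("label"))  # assumes label == word
        if rgb is None:
            continue
        # Drop any existing color so the node ends up with exactly one.
        for old in node.findall(f"{{{VIZ_NS}}}color"):
            node.remove(old)
        r, g, b = rgb
        ET.SubElement(node, f"{{{VIZ_NS}}}color", r=str(r), g=str(g), b=str(b))
    return ET.tostring(root, encoding="unicode")
```

Given the stems Linguistica reports for a signature, `wordform_colors(stems, "NULL-ed-ing-s")` gives the word-to-color mapping, and `color_gexf` applies it to the word-neighbor graph; nodes not in the mapping are left untouched.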

jacksonllee commented 10 years ago

A much smaller sample .gexf file with a colored signature is now available: english-brown_All_Words_1000_singleSig_NULL-d-s_19-9-3.gexf.