programminghistorian / ph-submissions

The repository and website hosting the peer review process for new Programming Historian lessons
http://programminghistorian.github.io/ph-submissions

Proposed Lesson: "Dealing with Big Data and Network Analysis" #77

Closed by ianmilligan1 7 years ago

ianmilligan1 commented 7 years ago

The Programming Historian has received the following proposal for a lesson on 'Dealing with Big Data and Network Analysis' by @jgmjgm. The proposed learning outcomes of the lesson are:

The goal of this proposed entry into the Programming Historian is to teach readers how to do advanced network analysis using large amounts of complex network data. At the end of this lesson readers will be able to construct, analyze and visualize networks based on big — or just inconveniently large — data. All of the steps in the tutorial will be illustrated using techniques that are robust and easily scalable so that readers can deal with large amounts of data.

This tutorial will focus on a handful of open source tools. These include the Neo4j graph database, the Cypher query language used by Neo4j, and the Python programming language. Neo4j is a free, open-source graph database written in Java that is available for all major computing platforms. Cypher is the query language for the Neo4j database, designed to insert and select information from the database. Python is a relatively simple language that can be readily used by beginner and advanced programmers alike. This tutorial will also use the point-and-click open source program Gephi to visualize networks.

This tutorial will outline how to read structured data into the Neo4j graph database, how to query the database for information, and how to extract, analyze and visualize relevant components of the graph. Neo4j and the Cypher query language make it easy to query large graphs and visualize the resulting subgraphs using Python or dedicated graph visualization tools like Gephi.
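As a rough illustration of the "read structured data into Neo4j" step, the sketch below turns rows of a small edge list into parameterized Cypher `MERGE` statements using only the Python standard library. It is not taken from the proposed lesson: the sample data, the `Person` label, the `KNOWS` relationship type, and the helper names `edge_statements` and `degree_counts` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical edge list; a real dataset would be read from a much larger file.
SAMPLE_CSV = """source,target
Alice,Bob
Bob,Carol
Alice,Carol
"""

# Parameterized Cypher statement. The Person label and KNOWS relationship
# type are illustrative assumptions, not part of the proposal.
MERGE_EDGE = (
    "MERGE (a:Person {name: $source}) "
    "MERGE (b:Person {name: $target}) "
    "MERGE (a)-[:KNOWS]->(b)"
)


def edge_statements(csv_text):
    """Yield one (cypher, parameters) pair per row of the edge list."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        params = {
            "source": row["source"].strip(),
            "target": row["target"].strip(),
        }
        yield MERGE_EDGE, params


def degree_counts(csv_text):
    """Count how many edges touch each node, as a quick sanity check."""
    counts = {}
    for _, params in edge_statements(csv_text):
        for name in (params["source"], params["target"]):
            counts[name] = counts.get(name, 0) + 1
    return counts


statements = list(edge_statements(SAMPLE_CSV))
degrees = degree_counts(SAMPLE_CSV)
```

Each pair in `statements` could then be handed to a Neo4j client session (for example, the official Python driver's `session.run(query, parameters)`), which keeps the data separate from the query text. Using `MERGE` rather than `CREATE` means that re-running the import should not duplicate nodes or relationships.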

In order to promote speedy publication of this important topic, we have agreed to a submission date of no later than June 5, 2017. The author agrees to contact the editor in advance if they need to revise the deadline.

If the lesson is not submitted by June 5, the editor will attempt to contact the author. If they do not receive an update, this ticket will be closed. The ticket can be reopened at a future date at the request of the author.

The main editorial contact for this lesson is @ianmilligan1. If the author has any concerns, they can contact the Ombudsperson, @amandavisconti.

ianmilligan1 commented 7 years ago

Closed and continued in #87