SiRumCz / CSC501

CSC501 assignments
graph works #70

Closed by jonhealy1 4 years ago

jonhealy1 commented 4 years ago

Instructions are in the README. I created links between source and target subreddits; each link stores the link sentiment and timestamp. To create the database, the original TSV file needs to be put into the Neo4j import folder. Also, the memory limit in Docker's preferences needs to be increased to 4 GB.

USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:////soc-redditHyperlinks-title.tsv" AS row FIELDTERMINATOR "\t"
MERGE (s:Subreddit {id: row.SOURCE_SUBREDDIT})
MERGE (t:Subreddit {id: row.TARGET_SUBREDDIT})
CREATE (s)-[:LINK {post_id: row.POST_ID, link_sentiment: toInteger(row.LINK_SENTIMENT), date: localDateTime(replace(row['TIMESTAMP'], ' ', 'T'))}]->(t)
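
For anyone who wants to sanity-check the data before loading it, here is a rough Python sketch of the per-row transformation the query above performs. It uses only the standard library and does not touch Neo4j. The column names come from the query; the function name `parse_links` and the two sample rows are made up for illustration and are not part of the repo.

```python
import csv
import io
from datetime import datetime


def parse_links(tsv_text):
    """Yield one dict per hyperlink row, mirroring the LINK properties in the query."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        yield {
            "source": row["SOURCE_SUBREDDIT"],
            "target": row["TARGET_SUBREDDIT"],
            "post_id": row["POST_ID"],
            # Same conversion as toInteger(row.LINK_SENTIMENT).
            "link_sentiment": int(row["LINK_SENTIMENT"]),
            # Same idea as replace(row['TIMESTAMP'], ' ', 'T'): turn the space
            # into a 'T' so the value parses as an ISO local date-time.
            "date": datetime.fromisoformat(row["TIMESTAMP"].replace(" ", "T")),
        }


# Hypothetical two-row sample, invented for illustration only.
SAMPLE = (
    "SOURCE_SUBREDDIT\tTARGET_SUBREDDIT\tPOST_ID\tTIMESTAMP\tLINK_SENTIMENT\n"
    "sub_a\tsub_b\tp1\t2014-01-01 12:00:00\t1\n"
    "sub_a\tsub_c\tp2\t2014-01-02 08:30:00\t-1\n"
)

links = list(parse_links(SAMPLE))
```

Each resulting dict corresponds to one `LINK` relationship between a source and a target `Subreddit` node. Extra columns in the real file are simply ignored here.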

jonhealy1 commented 4 years ago

Hopefully we can now start to move ahead with this assignment. There is an examine_data.py to use after load_data.py.

jonhealy1 commented 4 years ago

[Image: graph]

jonhealy1 commented 4 years ago

These are just links to ffiv.

SiRumCz commented 4 years ago

I am getting errors with python3 load_data.py.

py2neo.database.ClientError: ExternalResourceFailed: Couldn't load the external resource at: file:/var/lib/neo4j/import/soc-redditHyperlinks-title.tsv

Oops, never mind, I forgot to put the data file into the folder.

jonhealy1 commented 4 years ago

You have to manually put the file in the import folder because, at about 330 MB, it's too big for GitHub.
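
A small pre-flight check along these lines could catch the missing file before load_data.py hits the `ExternalResourceFailed` error. This is only a sketch: the helper name `data_file_present` is hypothetical, and the directory you pass in is whatever host folder is mapped to the Neo4j import folder in your setup, which is an assumption about your Docker configuration. The demo uses a temporary directory so it runs anywhere.

```python
import tempfile
from pathlib import Path

DATA_FILE = "soc-redditHyperlinks-title.tsv"


def data_file_present(import_dir, filename=DATA_FILE):
    """Return True if the TSV is already in the given import folder."""
    return (Path(import_dir) / filename).is_file()


# Demo with a temporary directory standing in for the import folder.
with tempfile.TemporaryDirectory() as tmp:
    before = data_file_present(tmp)   # nothing copied yet
    (Path(tmp) / DATA_FILE).touch()   # simulate copying the file in
    after = data_file_present(tmp)
```

In practice you would call `data_file_present(...)` with your real import folder at the top of the load script and exit with a clear message if it returns `False`.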

soroushysfi commented 4 years ago

Awesome! We'll start making some endpoints, coming up with queries, and visualizing them. Thanks, Jonathan.