SiRumCz / CSC501

CSC501 assignments

Goals for assignment #86

Open jonhealy1 opened 4 years ago

jonhealy1 commented 4 years ago

What is everyone's schedule this week? How much time do you see yourself putting into this assignment before it's due? I thought that we would get a lot more work done this weekend.

Extensions sometimes aren't too helpful because many people are going to do the bulk of the work in the last two days before an assignment is due. If the due date is moved back people will stop working on something until it's a day or two before the new due date.

Do you guys have time this week? I'm supposed to be in a sprint online from the 5th to the 7th as part of my internship. What can we accomplish?

SiRumCz commented 4 years ago

I have time tomorrow and Wednesday. Over the weekend I was actually working on my other project, which also involves graph down-sampling, clustering, and visualization. My implementation uses a relational database (postgresql) for data storage, the networkx python library for graph manipulation, and d3.js for visualization. Here is the link to my graph manipulation code: https://github.com/SiRumCz/Researches-on-NPM-public-packages/blob/master/visualization/flask/app.py. This library works very similarly to neo4j, with the advantage of handy, ready-to-use python functions. It also handles large graphs very well (my data has 470,000+ nodes and 2,000,000 edges, compared to >40,000 nodes in the assignment data). Because of this, I feel that we should seriously consider migrating our current work from the neo4j algorithms to this library's built-in algorithms. I think the cost won't be too high compared to the learning process for cypher. I can start working on it tonight and come up with one example of the visualization data before you guys agree on my proposal.

Kevin

jonhealy1 commented 4 years ago

I am 10,000% against switching libraries and throwing away what we have 3 days before the project is due. Why didn't you implement some of this stuff in our project earlier so we could at least have time to figure everything out?

I've spent countless hours working with neo4j now. It's not THAT difficult. Postgres is not a graph database!

jonhealy1 commented 4 years ago

What did you do/ accomplish with your project that you think we should do or can't do with ours?

jonhealy1 commented 4 years ago

@SiRumCz Can you just start working on our assignment? I am pretty busy for the next few days. We have lots of algorithms implemented. Finishing Soroush's graphs is not going to be too hard. Your plan involves throwing away everything I've done. If there's something you think we need to have done I could probably get it done, I just need to know.

SiRumCz commented 4 years ago

The problem I have with neo4j right now is that I am confused by the syntax of cypher and don't know how to use cypher to get the data for visualization.

On Mon., Nov. 4, 2019, 2:52 p.m. Jonathan Healy notifications@github.com wrote:

What did you do/ accomplish with your project that you think we should do or can't do with ours?


jonhealy1 commented 4 years ago

That's why I keep saying we should just use pandas to get the weights and the adjacency matrix.

soroushysfi commented 4 years ago

I think first we should come up with some questions about what insights we want to get out of the data, e.g. which 100 nodes have the highest impact, or which relations have the highest word count, or something like the one Jonathan did, which was really good. The next step would be converting these questions into queries and working out what data manipulation we need. After that we would just need to send the data to the front end to visualize it. Don't worry about the format; I will change the format myself in the frontend.

SiRumCz commented 4 years ago

@SiRumCz Can you just start working on our assignment? I am pretty busy for the next few days. We have lots of algorithms implemented. Finishing Soroush's graphs is not going to be too hard. Your plan involves throwing away everything I've done. If there's something you think we need to have done I could probably get it done, I just need to know.

I know you have been putting lots of effort into neo4j and I appreciate your work. This is why I am only making a proposal in the first place; I don't want to throw it away either. I just talked to Soroush, and he made a suggestion that might just make the whole thing work.

soroushysfi commented 4 years ago

And since Jonathan knows more cypher than Kevin and I do, once we know what we want to show, Jonathan can guide us on how to write the cypher queries.

jonhealy1 commented 4 years ago

We did a ton of algorithms with the cypher queries. I'm not sure what you guys want. A lot of the data that Soroush needs we can easily do in pandas. The lpa community detection should be done on a pruned graph. It's easiest to prune the data from the csv and then put it into neo4j.
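A minimal sketch of the "prune the csv first" step, assuming the edge list has already been reduced to one row per connection. The column names (`source`, `target`, `weight`), the sample values, and the `MIN_WEIGHT` cutoff are all hypothetical; nothing in the thread says which threshold or file layout was actually used.

```python
import pandas as pd

# Hypothetical collapsed edge list: one row per (source, target) pair.
edges = pd.DataFrame(
    {
        "source": ["a", "a", "b", "c", "d"],
        "target": ["b", "c", "c", "a", "a"],
        "weight": [14, 1, 6, 2, 9],
    }
)

MIN_WEIGHT = 5  # assumed cutoff, to be tuned against the real data


def prune_edges(df: pd.DataFrame, min_weight: int) -> pd.DataFrame:
    """Keep only the edges whose weight is at least min_weight."""
    return df[df["weight"] >= min_weight].reset_index(drop=True)


pruned = prune_edges(edges, MIN_WEIGHT)
print(pruned)

# The pruned frame could then be written out and imported into neo4j, e.g.:
# pruned.to_csv("pruned_edges.csv", index=False)  # hypothetical file name
```

Pruning before the import keeps the community detection from being dominated by one-off links, at the cost of dropping weakly connected nodes entirely.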

jonhealy1 commented 4 years ago

Using pandas I collapsed the data down into total link sentiment for each connection - now if we put that into neo4j it will look much better.

soroushysfi commented 4 years ago

I agree with you, we have a lot of things going on. Let's think about what insights we want to get so that we can turn them into cypher queries.

jonhealy1 commented 4 years ago

Like worldnews -> india had 14 rows before. Now it just has one with a weight of 14.
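A rough sketch of that collapse in pandas, assuming one row per link with a +1/-1 sentiment value. The column names and the extra sample rows are made up for illustration; only the worldnews -> india case with 14 rows comes from the comment above.

```python
import pandas as pd

# Hypothetical raw link table: 14 rows for worldnews -> india, plus two others.
rows = pd.DataFrame(
    {
        "source": ["worldnews"] * 14 + ["worldnews", "india"],
        "target": ["india"] * 14 + ["news", "worldnews"],
        "sentiment": [1] * 14 + [-1, 1],
    }
)

# One row per (source, target): total sentiment plus the number of raw links.
collapsed = rows.groupby(["source", "target"], as_index=False).agg(
    weight=("sentiment", "sum"),
    n_links=("sentiment", "count"),
)
print(collapsed)
```

Keeping `n_links` next to the summed `weight` makes it possible to tell a pair with 14 positive links apart from one whose positive and negative links happen to cancel out.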

jonhealy1 commented 4 years ago

I think you guys should concentrate on the graphs. Send the data from flask to d3. Forget about cypher

jonhealy1 commented 4 years ago

We don't have to just use one thing. We used cypher - we can use pandas - we can use a little networkx. I watched a talk that said developers usually use relational databases along with neo4j.

soroushysfi commented 4 years ago

Yeah that's totally true. So what questions are we answering with our visualizations? I mean what do you guys suggest we should query from our data?

jonhealy1 commented 4 years ago

You need to select small subsets of nodes to use for the visualization. The one I did is by topic. I am going to change the one I did because I am going to make a graph for positive weights and one for negative weights instead of just a total weight score.
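One way the positive/negative split could look, sketched under the assumption that each raw link carries a sentiment of +1 or -1. The `signed_weights` helper and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical raw links with a +1 / -1 sentiment per link.
links = pd.DataFrame(
    {
        "source": ["a", "a", "a", "b", "b"],
        "target": ["b", "b", "b", "c", "c"],
        "sentiment": [1, 1, -1, -1, -1],
    }
)


def signed_weights(df: pd.DataFrame, sign: int) -> pd.DataFrame:
    """Count links of one sign (+1 or -1) per (source, target) pair."""
    subset = df[df["sentiment"] == sign]
    return subset.groupby(["source", "target"]).size().reset_index(name="weight")


positive = signed_weights(links, 1)   # edge list for the positive graph
negative = signed_weights(links, -1)  # edge list for the negative graph
print(positive)
print(negative)
```

Each frame can be sent to the front end as its own edge list, so a pair such as a -> b shows up in both graphs instead of as a single netted-out score.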

jonhealy1 commented 4 years ago

I think what other groups are talking about with having the busiest nodes is not really very insightful.

jonhealy1 commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

SiRumCz commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

That is a cool idea. We could pick a few global or reddit news stories, identify the posts that happened within a timeframe after each one (from the time the news was posted until 1 month, or some other timeframe, later), and compare that to the average traffic in general.
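The comparison could be sketched like this, assuming every link has a timestamp. The `traffic_ratio` helper, the sample timestamps, and the event date are invented for illustration; the window length is a parameter because the thread leaves the timeframe open.

```python
import pandas as pd

# Hypothetical link timestamps (one entry per cross-post).
timestamps = pd.to_datetime(
    [
        "2019-01-01 10:00",
        "2019-01-02 09:00",
        "2019-01-03 08:00",
        "2019-01-03 12:00",
        "2019-01-03 13:00",
        "2019-01-03 20:00",
        "2019-01-04 01:00",
        "2019-01-04 02:00",
    ]
)


def traffic_ratio(stamps, event_date: str, window_days: int = 30) -> float:
    """Mean daily links in the window after event_date, divided by the overall daily mean."""
    # Daily link counts; days without links inside the observed range count as 0.
    daily = pd.Series(1, index=pd.DatetimeIndex(stamps)).resample("D").sum()
    start = pd.Timestamp(event_date)
    end = start + pd.Timedelta(days=window_days)
    window = daily[(daily.index >= start) & (daily.index < end)]
    return float(window.mean() / daily.mean())


ratio = traffic_ratio(timestamps, "2019-01-03", window_days=2)
print(ratio)
```

A ratio well above 1 would flag the window as busier than usual; running the same function per (source, target) pair would narrow the spike down to specific connections.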

soroushysfi commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

This sounds pretty cool. One thing I couldn't figure out: are we not using the properties section of the data? I couldn't find anything in neo4j, and using the properties would give us some insights.

SiRumCz commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

This sounds pretty cool. One thing I couldn't figure out: are we not using the properties section of the data? I couldn't find anything in neo4j, and using the properties would give us some insights.

no we have not, but it shouldn't be hard to do, just need to know which data in the property would be useful.

jonhealy1 commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

This sounds pretty cool. One thing I couldn't figure out: are we not using the properties section of the data? I couldn't find anything in neo4j, and using the properties would give us some insights.

no we have not, but it shouldn't be hard to do, just need to know which data in the property would be useful.

If you import the data into excel you can separate the properties into separate columns and then import that into python via csv to use the properties you want. @soroushysfi are you mostly a Javascript guy and not a Python guy?
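The same column split can also be done directly in pandas without the Excel round trip. This is only a sketch: it assumes the properties arrive as one comma-separated string of numbers per row, and the `prop_0`, `prop_1`, ... column names are placeholders, since the thread does not say what each property means.

```python
import pandas as pd

# Hypothetical raw rows with the properties packed into one string column.
raw = pd.DataFrame(
    {
        "source": ["a", "b"],
        "target": ["b", "c"],
        "properties": ["345.0,298.0,0.75", "101.0,98.0,0.62"],
    }
)

# Split the packed string into one numeric column per property.
props = raw["properties"].str.split(",", expand=True).astype(float)
props.columns = [f"prop_{i}" for i in range(props.shape[1])]

# Replace the packed column with the separate property columns.
wide = pd.concat([raw.drop(columns="properties"), props], axis=1)
print(wide)
```

Once the properties are separate columns, picking the useful ones is a plain column selection before the data goes to neo4j or the front end.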

jonhealy1 commented 4 years ago

@SiRumCz @soroushysfi Ok, what are our goals for this assignment? I would really like to know what you guys are working on.

jonhealy1 commented 4 years ago

@SiRumCz @soroushysfi Are you guys around tomorrow to go talk to Sean between 11 and 12?

soroushysfi commented 4 years ago

If we could identify super high traffic between nodes surrounding certain dates that would be pretty cool. Maybe try to identify an event that triggered an outburst.

This sounds pretty cool. One thing I couldn't figure out: are we not using the properties section of the data? I couldn't find anything in neo4j, and using the properties would give us some insights.

no we have not, but it shouldn't be hard to do, just need to know which data in the property would be useful.

If you import the data into excel you can separate the properties into separate columns and then import that into python via csv to use the properties you want. @soroushysfi are you mostly a Javascript guy and not a Python guy?

Yeah I have mostly worked with javascript and have little experience with python.

soroushysfi commented 4 years ago

@SiRumCz @soroushysfi Ok, what are our goals for this assignment? I would really like to know what you guys are working on.

The top level goal I think is to understand how to work with graph data, how to extract meaning out of it, what algorithms are used and in what ways we can visualize it. Since I am mostly interested in visualization I wanted to focus on that part, but I've learned a lot from the work you guys did in other parts and I'm trying to read more. What do you think?

soroushysfi commented 4 years ago

@SiRumCz @soroushysfi Are you guys around tomorrow to go talk to Sean between 11 and 12?

I have TA from 10:30, sorry. Are you going to talk to Sean about the goals of the assignment?

jonhealy1 commented 4 years ago

@SiRumCz @soroushysfi Ok, what are our goals for this assignment? I would really like to know what you guys are working on.

The top level goal I think is to understand how to work with graph data, how to extract meaning out of it, what algorithms are used and in what ways we can visualize it. Since I am mostly interested in visualization I wanted to focus on that part, but I've learned a lot from the work you guys did in other parts and I'm trying to read more. What do you think?

I'm mostly interested in cryptocurrency but that doesn't mean I'm not going to learn everything I can from every course I take and do everything I can to help with group projects.

jonhealy1 commented 4 years ago

@SiRumCz @soroushysfi Are you guys around tomorrow to go talk to Sean between 11 and 12?

I have TA from 10:30, sorry. Are you going to talk to Sean about the goals of the assignment?

I wanted to talk to him about how we can get everyone in the group to work together because I feel like you aren't doing as much work as everyone else. I didn't talk to him.

soroushysfi commented 4 years ago

@SiRumCz @soroushysfi Ok, what are our goals for this assignment? I would really like to know what you guys are working on.

The top level goal I think is to understand how to work with graph data, how to extract meaning out of it, what algorithms are used and in what ways we can visualize it. Since I am mostly interested in visualization I wanted to focus on that part, but I've learned a lot from the work you guys did in other parts and I'm trying to read more. What do you think?

I'm mostly interested in cryptocurrency but that doesn't mean I'm not going to learn everything I can from every course I take and do everything I can to help with group projects.

I want to clear up some stuff because I still think we have some problems in our group. First, when we formed our group for the first assignment (me, Kevin and Will) we all talked about what we are interested in and what our skills are. We decided that if we want to do this much work for the assignment, it is better to give each and every team member clear tasks and responsibilities (unlike what we are doing right now, because I feel you want every team member to pitch in on every part and do random stuff to make progress). By doing it this way we would talk about ideas, algorithms, data models, ... as a group and implement them individually (which in my opinion is much faster than if everyone was doing everything!). This method worked for the first two assignments and we got excellent marks (and then again you were panicking for the previous assignment but we got an A+ in the end!).

This method works for many different reasons. One is that we won't be wasting time on subjects in which we have no experience (why reinvent the wheel when others have more experience in that area and can do more in less time). Secondly, progress would be much faster because we will not have the learning curve in each assignment. Finally, for the time limit we have (which is two weeks!) this works really well; if we had more time (like a month) I would've agreed with you. So these are my reasons why we should work like this. If anyone has any opposing ideas I would be happy to hear their reasons.

I want to point out that it's not that I don't want to learn new stuff. If I didn't want to learn new stuff I would not have started my PhD and I would probably be working as a developer in a company. It's about being efficient and getting the most out of what we have. I know how to code python, I teach python to first year students, but I'm probably not as much of an expert as you and @SiRumCz. As Sean said, his expectations for the assignments are not that high!

I don't know how many hours you put in for the assignments, I'm guessing more than 10 hours per week. If you think it's really good for you and you are learning lots of stuff that you really like, I'm really glad for you. But honestly I think 10-14 hours for one assignment is enough (which is 10-14 hours in 15 days, not 7 days!). I don't want to spend 24/7 on one assignment (because as a graduate student I have other responsibilities). If you feel like we are not contributing as much as you and you want to get more credit, I think we cannot go forward as a group. If you and Kevin want to work together that's fine with me. But I think I cannot move forward with this group. You expect other team members to work on the assignment every day, which is not possible for me.

soroushysfi commented 4 years ago

After we hand in the assignment tomorrow, I'll tell Sean that I'll be leaving the group.

jonhealy1 commented 4 years ago
  1. Soroush we have a final project coming up. You just want to do visualizations. It's not fair for you to start working on them 2 days before the project is due and then like you said - you were super busy with other things today.

  2. We did do good on the last assignment. I feel like that we did good mostly because Kevin did such a great job on the report. The maps helped too, mostly the videos.

  3. You admit that you know Python. I am not a Python expert. I am not a database expert. You want to do visualizations but you need to know how to manipulate the data behind your visualizations. Not knowing how to do this is like wanting to play soccer without knowing how to run.

  4. I put a ton of hours into learning neo4j - even getting it running in Docker with the right settings took awhile. I thought we were using it for the assignment. You knew that it was a HUGE learning curve. Kevin was helping, he was paying attention to what was going on, he was learning some cypher stuff with the graph algorithms and he was learning networkX which was obviously super helpful. Unlike most other groups we could probably easily do graph databases as part of our final project. You weren't paying too much attention, you knew how much work we had, you thought that you would just put in a couple of half days once we were finished and this was not fair of you!

  5. I don't need more credit. I am happy to be learning all of this but I don't like carrying someone. I'm a Master's student and I have things going on as well. If me and Kevin both said - ok let's put 5 hours each total into this assignment - we'd all be screwed. I have no idea what you're doing tomorrow, if you have any time to work on this, or if you will have time to help with the final report and that's not fair. I feel like you're taking advantage of us.

  6. If you want to start working on the final project right away and not leave your part right to the end let me know. I'm not in school to do things half ass. I could really help you learn a lot about doing visualizations but you don't even want to listen or spend time on it. It's not fair that it's like 1 day before the project is due and I feel like I have to be up late working on visualizations which is the only thing you think that you should work on but you're not spending enough time doing them. Neo4j was running more than a week ago.

soroushysfi commented 4 years ago
  • Soroush we have a final project coming up. You just want to do visualizations. It's not fair for you to start working on them 2 days before the project is due and then like you said - you were super busy with other things today.
  • We did do good on the last assignment. I feel like that we did good mostly because Kevin did such a great job on the report. The maps helped too, mostly the videos.
  • You admit that you know Python. I am not a Python expert. I am not a database expert. You want to do visualizations but you need to know how to manipulate the data behind your visualizations. Not knowing how to do this is like wanting to play soccer without knowing how to run.
  • I put a ton of hours into learning neo4j - even getting it running in Docker with the right settings took awhile. I thought we were using it for the assignment. You knew that it was a HUGE learning curve. Kevin was helping, he was paying attention to what was going on, he was learning some cypher stuff with the graph algorithms and he was learning networkX which was obviously super helpful. Unlike most other groups we could probably easily do graph databases as part of our final project. You weren't paying too much attention, you knew how much work we had, you thought that you would just put in a couple of half days once we were finished and this was not fair of you!
  • I don't need more credit. I am happy to be learning all of this but I don't like carrying someone. I'm a Master's student and I have things going on as well. If me and Kevin both said - ok let's put 5 hours each total into this assignment - we'd all be screwed. I have no idea what you're doing tomorrow, if you have any time to work on this, or if you will have time to help with the final report and that's not fair. I feel like you're taking advantage of us.
  • If you want to start working on the final project right away and not leave your part right to the end let me know. I'm not in school to do things half ass. I could really help you learn a lot about doing visualizations but you don't even want to listen or spend time on it. It's not fair that it's like 1 day before the project is due and I feel like I have to be up late working on visualizations which is the only thing you think that you should work on but you're not spending enough time doing them. Neo4j was running more than a week ago.

I'm not going to talk about the first assignment because you joined us from the second one. What I'm hearing is that I did not contribute anything to the assignments and all the grades we got were from the tech report and the videos you made! I'm really glad to hear that more than half of our interactive visualizations were useless and didn't affect our mark! If that's the case, me staying in the group is useless because I put a burden on you guys and it seems that I have nothing to contribute. When we started to work on the third assignment, I don't know if you remember, I suggested working on a graph database (I didn't know which one to choose, Orientdb or Neo4j). First I tried to make progress by importing data into Orientdb, at which I wasn't successful (because of the poor community for this database). When I was trying to work with neo4j I saw you had made good progress on neo4j, so I started to study cypher and extract info out of it. After one week Kevin told me he was going to be working on the backend, so I decided to make mock visualizations (not 2 days before the deadline). So some of our visualizations were ready but they had some problems. Then I tried to make modifications to our visualizations and at the same time study cypher. I didn't make progress on cypher but that doesn't mean I didn't put in effort to learn it at all! I'm sorry that you're feeling I'm a burden; unfortunately I don't feel that way and I would gladly leave the group because apparently I don't contribute and I'm being carried by someone else (just as for the second assignment you thought Kevin was not doing anything). I will do as much as I can for this assignment until the deadline but for the fourth assignment and the project I will leave this group.

jonhealy1 commented 4 years ago
  1. I didn't feel like Kevin wasn't doing anything. I felt like he was being rude and was refusing to work with me.

  2. Your mock visualizations were just code you found online. I asked you to start playing with data that we were working on so we could get an idea about how they would look etc. I told you about Neovis more than a week ago - you didn't want to look at it. We can do some graphs in the neo4j browser - you didn't do anything with that.

  3. You played with cypher a little. I'm sorry but I don't think you put even a third of the time that me or Kevin put in. At one point you were using a super old version of the database. You're leaving it super late to do the visualizations. You could have started them a week ago and made them look way better.

  4. For the second assignment you ported a map that I figured out in Python over to Javascript with basically the same library. I thought that was great. I felt like you didn't do a whole lot after that. You're a visualization guy but you weren't really into knowing how I did the other maps. I don't think you tried to move any other maps into javascript. I was really disappointed in you when Kevin seemed to do the whole report. My opinion of Kevin did change after that and I apologized to you guys and Sean. I can see the work he's done now for this assignment and it's really impressive.

  5. Yea you told me you had worked with some graph databases before so it was really disappointing that you just decided to let me learn everything on my own.

  6. I'm sorry for not thinking that you have contributed enough. I think you should prove me wrong and start working with us right away on the final project. You have left this all to the last minute for this assignment.

jonhealy1 commented 4 years ago

Honestly Sean did say he would prefer that code was in a Jupyter notebook. The React stuff doesn't count for any marks. I like it because I think it's cool. You want to complain about time that assignments take but most of your time goes into doing something that is not giving us a better grade. For this assignment I guess we do need d3.js and neovis.js.

soroushysfi commented 4 years ago
  • I didn't feel like Kevin wasn't doing anything. I felt like he was being rude and was refusing to work with me.
  • Your mock visualizations were just code you found online. I asked you to start playing with data that we were working on so we could get an idea about how they would look etc. I told you about Neovis more than a week ago - you didn't want to look at it. We can do some graphs in the neo4j browser - you didn't do anything with that.
  • You played with cypher a little. I'm sorry but I don't think you put even a third of the time that me or Kevin put in. At one point you were using a super old version of the database. You're leaving it super late to do the visualizations. You could have started them a week ago and made them look way better.
  • For the second assignment you ported a map that I figured out in Python over to Javascript with basically the same library. I thought that was great. I felt like you didn't do a whole lot after that. You're a visualization guy but you weren't really into knowing how I did the other maps. I don't think you tried to move any other maps into javascript. I was really disappointed in you when Kevin seemed to do the whole report. My opinion of Kevin did change after that and I apologized to you guys and Sean. I can see the work he's done now for this assignment and it's really impressive.
  • Yea you told me you had worked with some graph databases before so it was really disappointing that you just decided to let me learn everything on my own.
  • I'm sorry for not thinking that you have contributed enough. I think you should prove me wrong and start working with us right away on the final project. You have left this all to the last minute for this assignment.

Visualizations are from online samples, but do you know what the process is of getting all the visualizations, from different versions and with their different data formats, into one place and making them work with the latest version? Some use csv, tsv or even raw txt as input. Some of them work with the second version of d3, some of them work with version 4, and some work with version 3. So making all these changes actually takes effort and time, if you didn't know. I never said using React would give us marks; I always said that by using React, a component-based library, we can have reusable components to use in future work (for example, if we wanted to use graph visualization in our final project we would simply copy and paste our component into our new project). I explained all the reasons in our tech report, you can take a look at it. I really liked the videos you rendered and I am definitely sure they affected our grades. But it's really unfair to say that three different videos with different levels of zoom on one map made most of the marks we got. You are not mentioning heat maps, d3 charts, and the interval trees at all, which is really unfair, and I'm really sad that you think this way. I cannot work in a group where my work is not appreciated at all. It is really shocking to me that you say we got 4/4 in visualization just because we had videos! I am sorry that you think I worked only at the last minute. I did work during these two weeks and I contributed as much as I could for this assignment. If you think my work is useless and doesn't make any sense, then I don't think it is good for me to continue with you guys. I have nothing to prove and I did my part as much as I could and I hope you guys make great progress on the fourth assignment and the project.

jonhealy1 commented 4 years ago

Why can't you just admit you could have started getting serious about this assignment earlier. We have a final project coming up and starting on it at the last minute is not great. Maybe you deserve more credit for assignment 2 and I'm sorry for that. You have not contributed to this assignment enough - I'm sorry but you haven't. Just admit it and move on - no one is going to hold it against you. What you did this morning or last night should have been done a week ago.

jonhealy1 commented 4 years ago

Kevin was approving all my pull requests - there were quite a few - and you didn't even realize that they could be approved with only 1 approval until the last minute. I know when someone is not putting effort into something. You put some effort in sure. If you're writing the report tomorrow then maybe you will make up for lost time however I have asked you numerous times if you will have time tomorrow and you have not answered.

jonhealy1 commented 4 years ago

You need to start working with Python.

soroushysfi commented 4 years ago

Why can't you just admit you could have started getting serious about this assignment earlier. We have a final project coming up and starting on it at the last minute is not great. Maybe you deserve more credit for assignment 2 and I'm sorry for that. You have not contributed to this assignment enough - I'm sorry but you haven't. Just admit it and move on - no one is going to hold it against you. What you did this morning or last night should have been done a week ago.

I will not admit to something just because it did not meet your expectations. I'm going to paraphrase what Sean said last week: "students are putting too much effort into the assignments". He said the expectations are lower and you don't have to work on the assignments 20 hours per week. You can but you don't have to. So if you want to work on the projects 20 hours per week it doesn't mean I have to too! And this doesn't mean I did nothing! I contributed for the part I was in charge of and I know I did my part to a point to get 4/4 on the visualization. Because it also covers research papers. I am gonna say it again: if you haven't looked at the tech report you should, I wrote almost 8 pages. When I asked why you merged the branch and I asked about reviews, I was just being polite to let you know that PRs do need two reviews:

[Screenshot: Screen Shot 2019-11-07 at 12 00 57 AM]

I don't know how you interpret the first sentence: "At least two approving reviews are required"! This is how we did it in the first assignment, and obviously this means you're not familiar with how git works. So whenever you merge a request with only one review, this means it is not important to you that the third reviewer reviews the code! You even once tried to merge my branch! That obviously proves my point! Thank you for mentioning what I should do, I don't think I need that anymore since I'm not working with you anymore.

jonhealy1 commented 4 years ago

If we would have waited on the third review from you we would be waaaaaay behind right now.

jonhealy1 commented 4 years ago

You are not the only reason we might get good marks for the visualizations - you are at most 1/3 of the reason. My visualization stuff is just as good and Kevin did the grunt work for yours to even work. You could have done the apis yourself last week but you were being lazy.

soroushysfi commented 4 years ago

If we would have waited on the third review from you we would be waaaaaay behind right now.

I am not continuing this discussion any more because it looks like we are not understanding each other. I will leave the group and hope the best for you. Have a good night.

jonhealy1 commented 4 years ago

Haha yea ok.

jonhealy1 commented 4 years ago

Why are you mad at Kevin? You're putting him in a pretty crappy position. Just make a commitment to start working on the final project right away and everyone will be happy.

jonhealy1 commented 4 years ago

If we would have waited on the third review from you we would be waaaaaay behind right now.

I am not continuing this discussion any more because it looks like we are not understanding each other. I will leave the group and hope the best for you. Have a good night.

There are all these pull requests that are being approved without you. You didn't think they could be approved with only 2 people. I approved a bunch of Kevin's too. Why didn't you think of this?