Closed choldgraf closed 9 years ago
I will be out of town starting this evening until next Wednesday. Our objective is to have one graph done by next week for the presentation. We are still figuring out how to input and graph data in NetworkX.
I just finished making a preliminary graph for the network analysis, but it is extremely cluttered, which I guess is to be expected (over 20,000 supplier nodes). So the next step is to find a way to create a legible graph.
That's pretty cool - can you commit some code that you used to create this? Perhaps you can give an explanation of how it works.
Once we hear from @juanshishido and @kaiweitan, I'll put together a plan for the meeting
I have installed Anaconda on my MacBook. The challenge for me right now is to learn the appropriate tools/code to produce data visualizations. I will be down for the meeting today.
So here are the few questions I am working on:
I have been searching the net for the appropriate code for the following questions.
How do I create a new column for Time of procurement, i.e. the length of time between the two dates (Po_closed_date - Creation date)?
How do I graph Time of procurement against Buyer_Last_Name?
How do I graph Time of procurement against Department name?
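For the three questions above, here is a minimal pandas sketch. The sample rows and the exact column names (`Creation_Date`, `Po_Closed_Date`, `Department_Name`) are assumptions made for illustration; the real purchasing data may name and format these differently.

```python
import pandas as pd

# Hypothetical sample rows; the real column names and date formats may differ.
orders = pd.DataFrame({
    "Creation_Date": ["2014-01-02", "2014-01-10", "2014-02-01", "2014-02-15"],
    "Po_Closed_Date": ["2014-01-12", "2014-01-15", "2014-03-03", "2014-02-20"],
    "Buyer_Last_Name": ["Smith", "Lee", "Smith", "Lee"],
    "Department_Name": ["Physics", "Physics", "Chemistry", "Chemistry"],
})

# Parse the date strings so they can be subtracted from each other.
for col in ["Creation_Date", "Po_Closed_Date"]:
    orders[col] = pd.to_datetime(orders[col])

# New column: number of days between creation and closing of the order.
orders["Time_of_Procurement"] = (
    orders["Po_Closed_Date"] - orders["Creation_Date"]
).dt.days

# Average procurement time per buyer and per department.
mean_by_buyer = orders.groupby("Buyer_Last_Name")["Time_of_Procurement"].mean()
mean_by_department = orders.groupby("Department_Name")["Time_of_Procurement"].mean()

# With matplotlib installed, either Series can be plotted as a bar chart:
# mean_by_buyer.plot(kind="bar")
```

Taking the mean is only one choice of summary; a box plot per buyer or department would show the spread as well.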
it is cool but it doesn't show much, it should be good enough for the meeting on the 15th though
i will send out more guidelines on what i think needs to be done to improve the graph(s) later this evening
darius
Darius Mehri Ph.D. Candidate, Sociology University of California, Berkeley
That would be perfect, thanks Darius
here is something really quick:
graph only three months of one year, e.g. jan, feb, march. you will need to go back to the original data (before you used the drop_duplicates() function), subset the data to those months and then drop duplicates. if it is still too cluttered, then just do one month
if you can, create dept nodes as shaded circles and supplier nodes as circles without shade but with shaded edges
display the graph with the nodes sized according to a centrality measure, those with high centrality large and those with low centrality small (there should be a way in networkx to do this automatically, this is standard in network packages). in this way, we can visualize who the central actors are (although we can't put names to them yet, it will be nice to know if there are a handful of actors who are centrally located)
i am out of the state for a week and am very busy w/ interviews, i can perhaps do some of this but i probably can't do that much until spring break
darius
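A rough sketch of the guidelines above, assuming pandas and NetworkX. The sample rows and column names are hypothetical, degree centrality is just one possible choice of measure, and "shaded edges" is read here as an outlined, unfilled circle. As far as I know NetworkX does not size nodes by centrality automatically, so the sizes are computed by hand and passed to the drawing call.

```python
import networkx as nx
import pandas as pd

# Hypothetical purchase records; real column names may differ.
purchases = pd.DataFrame({
    "Creation_Date": pd.to_datetime([
        "2014-01-05", "2014-01-20", "2014-02-11",
        "2014-03-02", "2014-03-02", "2014-06-30",
    ]),
    "Department_Name": ["Physics", "Physics", "Chemistry",
                        "Physics", "Physics", "Chemistry"],
    "Supplier_Name": ["Acme", "Globex", "Acme",
                      "Initech", "Initech", "Umbrella"],
})

# Subset to Jan-Mar first, then drop duplicate department/supplier pairs.
first_quarter = purchases[purchases["Creation_Date"].dt.month.isin([1, 2, 3])]
pairs = first_quarter[["Department_Name", "Supplier_Name"]].drop_duplicates()

# Build the graph, tagging each node with its kind.
# Assumes no department shares a name with a supplier.
G = nx.Graph()
G.add_nodes_from(pairs["Department_Name"].unique(), kind="department")
G.add_nodes_from(pairs["Supplier_Name"].unique(), kind="supplier")
G.add_edges_from(pairs.itertuples(index=False, name=None))

# Scale node sizes by degree centrality (the scaling constants are arbitrary).
centrality = nx.degree_centrality(G)
node_sizes = {node: 300 + 3000 * c for node, c in centrality.items()}


def draw(graph, sizes):
    """Draw departments as filled circles and suppliers as outlined circles."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    pos = nx.spring_layout(graph, seed=0)
    for kind, fill in [("department", "gray"), ("supplier", "white")]:
        nodes = [n for n, d in graph.nodes(data=True) if d["kind"] == kind]
        nx.draw_networkx_nodes(
            graph, pos, nodelist=nodes,
            node_size=[sizes[n] for n in nodes],
            node_color=fill, edgecolors="gray",
        )
    nx.draw_networkx_edges(graph, pos)
    plt.axis("off")
    plt.show()
```

Calling `draw(G, node_sizes)` shows the plot. With over 20,000 supplier nodes the layout step may be slow, which is another reason to subset by month first.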
btw, the centrality measure will probably be for one-mode (unimodal) graphs, whereas we have a two-mode graph. don't worry about this, we can figure out later how to do it correctly, it will still give nice visuals
darius
one more point, i think two-mode is the same as bipartite, two-mode is the more common usage in sociology. here is the article i am referencing:
http://www.analytictech.com/borgatti/papers/2modeconcepts.pdf
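If two-mode and bipartite are indeed the same thing, NetworkX has a `bipartite` module whose `degree_centrality` normalizes each node's degree by the size of the opposite node set. The sketch below compares it with the ordinary one-mode measure on a toy graph; the node names are made up, and whether this is the normalization used in the article above is not something I have checked.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy two-mode graph: departments on one side, suppliers on the other.
departments = ["Physics", "Chemistry"]
suppliers = ["Acme", "Globex", "Initech"]

G = nx.Graph()
G.add_nodes_from(departments, bipartite=0)
G.add_nodes_from(suppliers, bipartite=1)
G.add_edges_from([
    ("Physics", "Acme"),
    ("Physics", "Globex"),
    ("Physics", "Initech"),
    ("Chemistry", "Acme"),
])

# One-mode degree centrality divides by (number of nodes - 1).
one_mode = nx.degree_centrality(G)

# Bipartite degree centrality divides by the size of the opposite node set.
two_mode = bipartite.degree_centrality(G, departments)

for node in G:
    print(f"{node:10s} one-mode={one_mode[node]:.2f} two-mode={two_mode[node]:.2f}")
```

Physics buys from every supplier, so its two-mode score is the maximum possible, while the one-mode score is lower because it also counts the other department as a potential neighbor.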
Thanks for the feedback Darius - we'll talk about it at the meeting and can get thoughts back to you.
@juanshishido, are you planning to video-chat into this meeting? Chris
OK, here's a brief plan for today. We'll be meeting at BIDS at 1:30pm (in 9 minutes so hopefully you already knew this haha).
- Admin stuff (5 min)
- Update from Juan/Kai meeting + brainstorm page (5 min)
- BIDS update tomorrow (10 min) - I'm giving a short presentation tomorrow, let's talk about some challenges we've faced, things worth mentioning
- Project updates / challenges (20 min) - Talk about what we've tried so far, and what challenges we're facing / what we need help with.
Total meeting time should hopefully be about 30-40 minutes. See you guys soon.
Yeah, the BIDS presentation tomorrow and the BIDS social teas the Monday after Spring Break are all great venues to get feedback and ask for help from experts in the field. If there are any real roadblocks, it might be worth sending a message out on the BIDS Slack.
Cheers,
Anthony
Anthony Suen
please mention that since we have time-series data, we can potentially do some really cool stuff (eventually) with network analysis, i.e. see how the structure and measures change over time, by month, week and even day (if we get hardcore)
darius
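As a rough illustration of the time-series idea, one option is to build a separate graph per month and track a few summary measures. This is only a sketch: the rows and column names are hypothetical, and node count, edge count and density stand in for whatever measures turn out to be interesting.

```python
import networkx as nx
import pandas as pd

# Hypothetical purchase records; real column names may differ.
purchases = pd.DataFrame({
    "Creation_Date": pd.to_datetime([
        "2014-01-05", "2014-01-20", "2014-02-11", "2014-02-12", "2014-02-25",
    ]),
    "Department_Name": ["Physics", "Physics", "Chemistry", "Physics", "Chemistry"],
    "Supplier_Name": ["Acme", "Globex", "Acme", "Acme", "Initech"],
})

rows = []
# Group the records by calendar month and build one graph per month.
by_month = purchases.groupby(purchases["Creation_Date"].dt.to_period("M"))
for month, group in by_month:
    pairs = group[["Department_Name", "Supplier_Name"]].drop_duplicates()
    G = nx.Graph()
    G.add_edges_from(pairs.itertuples(index=False, name=None))
    rows.append({
        "month": str(month),
        "nodes": G.number_of_nodes(),
        "edges": G.number_of_edges(),
        "density": nx.density(G),
    })

# One row per month, ready to plot or compare.
monthly = pd.DataFrame(rows).set_index("month")
print(monthly)
```

Changing `"M"` to `"W"` or `"D"` in `to_period` would give weekly or daily snapshots, although the graphs get sparser as the window shrinks.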
Hey guys - @nlin3330, @kaiweitan and I met today to talk about updates on the project and address some questions that people had. Here are some general updates:
Talk to you guys soon, Chris
Apologies for not commenting earlier. I had a big project to finish by 4pm today. For some reason, I didn't have today's meeting on my calendar. Maybe someone can send me that?
The notebook I updated needs a lot more. My plan is to:
Thanks all for your thoughts. Depending on how #11 goes, we'll meet on Thursday or Friday of this week. BIDS checkpoint is coming up, so we should put together preliminary analyses and make them look pretty / interesting, even if they're not finalized. Then we can use this in our presentation.
This will be a short update meeting to discuss progress on the following projects:
7 Network analysis
8 Time to completion analysis
9 Text classification analysis
Please post your progress here, as well as challenges that you've faced and things that need to be done next. As a group we can discuss these tomorrow.
@juanshishido @nlin3330 @dariusmehri @kaiweitan @anthonysuen
Chris