BIDS-collaborative / purchasing

Working with Andrew Clark on optimizing purchasing with data
ISC License

Meetings for the week of March 16th - Team updates #12

Closed choldgraf closed 9 years ago

choldgraf commented 9 years ago

Here we will discuss updates from each team meeting. These will be:

  • @juanshishido and me on Wednesday at 3:00pm in BIDS
  • @kaiweitan as a personal update
  • @nlin3330 and me on Thursday at 3:30pm in BIDS
  • @dariusmehri as a personal update

Please post any and all progress, questions, and comments here, and I will update after the meetings.

NOTES FROM MEETINGS

To Do @me

  1. Send Nick Adams an e-mail about text processing
  2. Figure out what "Service Order Request" is. If it means maintenance, see if it's seasonal at all.
  3. Figure out what "coaches corporate credit cards" is
  4. Spend vs. quantity*unit_price...why are these different?
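
One way to start on item 4 is to flag the rows where the recorded spend disagrees with quantity * unit_price and then look at what those rows have in common. This is only a minimal sketch: the field names (spend, quantity, unit_price) and the rows are made up for illustration, and the real purchasing data may be laid out differently.

```python
from math import isclose

# Made-up rows for illustration only; the field names are assumptions.
rows = [
    {"item": "paper", "quantity": 10, "unit_price": 4.50, "spend": 45.00},
    {"item": "toner", "quantity": 2, "unit_price": 60.00, "spend": 131.40},
    {"item": "chairs", "quantity": 3, "unit_price": 120.00, "spend": 360.00},
]


def find_mismatches(rows, tolerance=0.01):
    """Return (row, difference) pairs where spend != quantity * unit_price."""
    mismatches = []
    for row in rows:
        expected = row["quantity"] * row["unit_price"]
        # Allow a one-cent tolerance so rounding alone is not flagged.
        if not isclose(row["spend"], expected, abs_tol=tolerance):
            mismatches.append((row, row["spend"] - expected))
    return mismatches


mismatches = find_mismatches(rows)
for row, diff in mismatches:
    print(f"{row['item']}: spend differs from quantity * unit_price by {diff:.2f}")
```

Grouping the flagged differences by department or supplier may hint at a cause (tax, shipping, or discounts are guesses to check, not findings).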

@juanshishido

  1. Come up with a hand-made word to category dictionary and see how this does
  2. Create a list of most probable words
  3. Use description along with manufacturer name
  4. Come up with a list of categories we want to use
  5. Split up word probabilities by department
  6. Preprocess text data and remove unwanted words/characters
  7. Create a word frequency matrix
  8. Run LDA / clustering on this
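
Items 1, 6, and 7 above can be sketched with the standard library alone. The descriptions, stop words, and word-to-category dictionary below are hypothetical examples, not taken from the purchasing data, and the majority-vote rule in categorize is just one simple choice.

```python
import re
from collections import Counter

# Hypothetical item descriptions and a hand-made word-to-category dictionary.
descriptions = [
    "HP Toner Cartridge, Black (2-pack)",
    "Office chair - black mesh",
    "Toner cartridge cyan",
]
word_to_category = {"toner": "printing", "cartridge": "printing", "chair": "furniture"}
stop_words = {"a", "the", "of", "and", "pack"}


def tokenize(text):
    """Lowercase, keep only runs of letters, and remove stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in stop_words]


def word_frequency_matrix(docs):
    """Return (vocabulary, matrix) with one row of word counts per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


def categorize(text, default="unknown"):
    """Pick the category that the most words in the text vote for."""
    votes = Counter(word_to_category[w] for w in tokenize(text) if w in word_to_category)
    return votes.most_common(1)[0][0] if votes else default


vocabulary, matrix = word_frequency_matrix(descriptions)
categories = [categorize(d) for d in descriptions]
```

A matrix of this shape (documents by words) is the usual input for the LDA / clustering step in item 8; on the real data a library vectorizer would likely replace the hand-rolled one.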

@nlin3330 & @dariusmehri

  1. Play with color coding of networks to make it more readable
  2. Come up with an image / plot that shows something interesting for presentation on the Monday after spring break.
  3. Start playing around with an analysis of average price per item @nlin3330
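
For item 3, a possible starting point is a quantity-weighted average: total spend divided by total quantity for each item. The field names and purchase lines below are assumptions for illustration only.

```python
from collections import defaultdict

# Made-up purchase lines; field names are assumptions for illustration.
purchases = [
    {"item": "toner", "quantity": 2, "spend": 120.00},
    {"item": "toner", "quantity": 1, "spend": 66.00},
    {"item": "paper", "quantity": 10, "spend": 45.00},
]


def average_price_per_item(purchases):
    """Return total spend divided by total quantity for each item."""
    spend = defaultdict(float)
    quantity = defaultdict(int)
    for p in purchases:
        spend[p["item"]] += p["spend"]
        quantity[p["item"]] += p["quantity"]
    # Skip items whose total quantity is zero to avoid dividing by zero.
    return {item: spend[item] / quantity[item] for item in spend if quantity[item]}


averages = average_price_per_item(purchases)
```

Weighting by quantity keeps one large order from counting the same as one small order; a plain mean of unit prices is the alternative if each purchase line should count equally.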

Note: Next week is spring break. I'll send out e-mails about meeting, but I'm assuming people may be gone. Be in touch via github so we can plan out what we'll present at the BIDS checkpoint meeting on the Monday after spring break.

testchange commented 9 years ago

I have yet to work on the code, but I will submit the latest by tomorrow morning.

choldgraf commented 9 years ago

Sounds good - we'll await your project update tomorrow morning.

choldgraf commented 9 years ago

Notes from Chris and Juan's meeting:

To Do @me

  1. Send Nick Adams an e-mail about text processing
  2. Figure out what "Service Order Request" is. If it means maintenance, see if it's seasonal at all.
  3. Figure out what "coaches corporate credit cards" is
  4. Spend vs. quantity*unit_price...why are these different?

@juanshishido

  1. Come up with a hand-made word to category dictionary and see how this does
  2. Create a list of most probable words
  3. Use description along with manufacturer name
  4. Come up with a list of categories we want to use
  5. Split up word probabilities by department

choldgraf commented 9 years ago

@nlin3330: @juanshishido and I are meeting with Nick Adams in the D-lab today at 3:00pm to talk about text analysis and mining. Would you like to join us? It should be a useful conversation, and then we can chat about your project after this.

Also, a number of you guys have messaged me / others on the team via google hangouts...just another plug for moving that conversation to Slack, as it may make it easier to keep track of conversations in this project.

nlin3330 commented 9 years ago

That would work for me.

choldgraf commented 9 years ago

Great, I'll see you and @juanshishido at 3:00PM at BIDS.

Chris

dariusmehri commented 9 years ago

hey guys, i can't meet this friday, but i can meet next friday. nick, can you get together over spring break to work on the graphs?

darius

On Wed, Mar 18, 2015 at 12:54 PM, Chris Holdgraf notifications@github.com wrote:

Here we will discuss updates from each team meeting. These will be:

@juanshishido https://github.com/juanshishido and me on Wednesday at 3:00pm in BIDS @kaiweitan https://github.com/kaiweitan as a personal update @nlin3330 https://github.com/nlin3330 and me on Thursday at 3:30pm in BIDS @dariusmehri https://github.com/dariusmehri as a personal update

Please post any and all progress, questions, and comments here, and I will update after the meetings.

— Reply to this email directly or view it on GitHub https://github.com/berkeley-dsc/purchasing/issues/12.

Darius Mehri Ph.D. Candidate, Sociology University of California, Berkeley

choldgraf commented 9 years ago

hey @dariusmehri, meetings will be on Thursdays at 3:30pm from now on. Team meetings one week, full group meeting the other week (alternating). Same time and place for both. Is that an issue?

nlin3330 commented 9 years ago

I can't meet over spring break since I will be out of town but I will try to make some progress on the graphs. Sorry

dariusmehri commented 9 years ago

hi chris, i can meet on thursdays, darius

choldgraf commented 9 years ago

@nlin3330 and @dariusmehri, you guys should remain in contact during Spring Break to coordinate the network analysis stuff. We are giving a short presentation on our progress the Monday after spring break, and we should have some preliminary results so that we get good feedback from the people at BIDS.

dariusmehri commented 9 years ago

sure, i just emailed nick about times to get together, darius

choldgraf commented 9 years ago

Thanks - really looking forward to seeing the network project evolve. Results so far seem very promising.

juanshishido commented 9 years ago

Here is my update.

Started preprocessing text data; needs more work (e.g., removing some special characters). Have a list of the most probable words.

Pushed an update to the text_analysis.ipynb file.

To continue:

  • Split word probabilities by department
  • Hand-made word-to-category dictionary
  • Come up with a list of categories
  • Word frequency matrix (looked into this already; scikit-learn has CountVectorizer: from sklearn.feature_extraction.text import CountVectorizer)
  • Explore LDA
  • Put work in a Python script

choldgraf commented 9 years ago

Looks great - looking forward to seeing the final product!

choldgraf commented 9 years ago

Closing this - looking forward to seeing some pretty pictures that we can present on Monday.