brownhci / drafty

Other
9 stars 4 forks source link

Possible Error: addToHM() - UserInterestService #114

Closed swallace21 closed 8 years ago

swallace21 commented 8 years ago

Line 188: public void addToHM(HashMap<String, Integer> hm, List list, int weight)

Does not return anything or modify a variable outside of that function.

swallace21 commented 8 years ago

Tracking down scope of variables passed into function. Might technically work, but difficult to trace and read through.

lucyvk commented 8 years ago

I think this is used as a utility function to add items to the hashmaps?

On Tue, May 17, 2016 at 11:13 PM, Shaun Wallace notifications@github.com wrote:

Tracking down scope of variables passed into function. Might technically work, but difficult to trace and read through.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/brownhci/drafty/issues/114#issuecomment-219914454

maubinle commented 8 years ago

Yeah this is pretty integral currently, as it is used to update the hashmaps each time a user makes an interaction or upon initial load. It basically just adds that thing to the hashmap with the correct weight if it is not already there, or updates the weight of it is. If there is a better way of doing this, feel free to change it, but the model really relies on this right now.

On Wednesday, 18 May 2016, Lucy Van Kleunen notifications@github.com wrote:

I think this is used as a utility function to add items to the hashmaps?

On Tue, May 17, 2016 at 11:13 PM, Shaun Wallace <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Tracking down scope of variables passed into function. Might technically work, but difficult to trace and read through.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/brownhci/drafty/issues/114#issuecomment-219914454

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/brownhci/drafty/issues/114#issuecomment-219916184

Marianne Aubin Le Quéré Brown University '17 Director and Co-Founder of Folkmade http://www.folkmade.org Facebook http://facebook.com/folkmadeprovidence // LinkedIn http://linkedin.com/in/maubinlequere

swallace21 commented 8 years ago

Thanks Marianne. Yeah I can see why it is used. However the scope of the function makes it impossible to debug or monitor what it is modifying. I have rewritten and cleaned up code in this class. There were several small errors. It appears to be working okay now. However it is difficult to be 100% bc of how everything was constructed. There still appears to several sections that might contain inconsistencies.

For example why does _fieldInterest contain JoinYear, Rank, and Subfield data? For example there are other places in the code where _yearInterest is used.

What would really help and needs to be done, is a formal written definition of the user interest model. This is necessary for the CHI paper submission as well. It will also make bug checking much easier. We had discussed you writing this before. Can you put this together over the next 1-2 days? Write it as if it is being included in a paper submission.

I am on vacation all next week and will be gone on my anniversary for a few days. I'd really like not to work over my vacation. Since this is probably the only week off I will have until late December.

I will continue checking and optimizing what I see. As well as integrating everything into the global API structure and experiment 1 code.

maubinle commented 8 years ago

Hi Shaun, I'm sorry there were small errors — that's why I needed another set of eyes to look at the code. I finished my exams today and can debug a little more before this Saturday. Have your changes been pushed so I can edit further?

Also, it seems like you're a little frustrated with the code. Can you elaborate on what you mean by it being difficult to debug "because of how everything was constructed"? I thought I stuck to the plan pretty well and am curious to know where/if I am going wrong.

The formal definition is actually already done. It's in the wiki: https://github.com/brownhci/drafty/wiki/Algorithm-Description Let me know if this is what you meant, that's what we had discussed in our meeting. It's not currently geared for a paper though, so I can definitely change that if necessary.

swallace21 commented 8 years ago

Hey Marianne. I will get it up and running. My goal right now is to have it ready for Monday. It will be easier if I just concentrate on finishing it off.

BTW when I call printInterest() it prints no interest for these: no bach no mast no doct no postdoc no gender

This goes back to a question I asked before. Why is everything, not a university name, only being stored in the _field hashmap? That doesn't match the first paragraph in the description of the model?

Once I get everything integrated with the experiment I will push everything to the dev branch again.

There was a ton of dead code. Classes and variables passed around that did nothing. I had the code examples and explained how using the global variables and functions to access everything would be much simpler. It would create a cleaner result.

Using "this." everywhere makes really hard to determine the scope of variables and functions being modified. Depending how it was used

I realize writing re-usable classes and functions on this level is difficult and takes some experience to do it right. Taking the data structures class or the software engineering class would help with this. In general don't feel too bad, these are skills that students usually develop in internships or during the their first 1-2 years as a developer. It can also be difficult because every team handles the Object Oriented soup differently.

maubinle commented 8 years ago

If it prints no interest for those, that probably just means you haven't interacted with something in that column directly. Try clicking on a cell for each and see if that changes things. At least, it works fine on mine once I have interacted with all of those columns.

There does seem to be an issue right now where things like JoinYear and PhotoURL are being added to the _profs hashmap. I think Lucy(?) changed the way that this was done to account for row values, so I am not as familiar with that code.

I'm sorry, I don't understand what you mean by it only being stored in the _field hashmap. This is what I get after using the system for a bit:

screen shot 2016-05-20 at 4 36 59 pm

So everything seems to be fine on my end, other than that the JoinYears for me are recording as prof names. This was working before and it seems to be the only problem on mine, so I'm sure this is an easy fix.

swallace21 commented 8 years ago

I fixed the JoinYears popping up in profs. I am getting bach, mast, doct, to show now.

The _field interest map on mine shows everything that is not a university: field interest hm is {=4, Real-Time & Embedded Systems=2, Computer Education=1, Algorithms & Theory=1, 2012=1, 2001=4, Data Mining=3, 2000=9, 2009=2, 1987=4, 1998=4, 2008=3, 1996=4, Networks & Communications=2, 2006=1, Machine Learning & Pattern Recognition=1, 2005=1, 2004=17, 1981=1, 2002=1, 2013=1, 1991=2, Natural Language & Speech=1, Security & Privacy=4}

I pushed my code to the dev branch. Still needs more testing. I'd highly recommend downloading that branch as a separate project. Or back-up your code first. There are some big changes to the architecture under way.

The issue with field interest needs to be sorted out. The code that I pulled from master branch has always shown the field interest like that. Also most of the functions in printInterest() except for prof, uni, and field were commented out in that code. Look at the bottom here: https://github.com/brownhci/drafty/blob/master/src/main/java/drafty/services/UserInterestService.java

It appears that is the last obvious bug to fix. I do have things from my notes to go over for the experiment structure that I need to check.

maubinle commented 8 years ago

Ok, I am glad we are coming to a solution. I don't think I can help with the field Interest though as I am not getting this issue.

It was commented out because they were just there if we wanted to use them, but after some time I didn't want it clogging up the screen -- that way we can print just the ones we need right now.

lucyvk commented 8 years ago

Just pulled the dev branch. I see the join year issue - looks like "JoinYear" was saved as "Join Year" in the switch statement in the getHashMap method, which caused that method to return the professor hashmap (the default).

I don't see the field issue though.

On Fri, May 20, 2016 at 5:08 PM, maubinle notifications@github.com wrote:

Ok, I am glad we are coming to a solution. I don't think I can help with the field Interest though as I am not getting this issue.

It was commented out because they were just there if we wanted to use them, but after some time I didn't want it clogging up the screen -- that way we can print just the ones we need right now.

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/brownhci/drafty/issues/114#issuecomment-220718933

lucyvk commented 8 years ago

I also noticed that the dev version still has the bug where some values get added to the hashmaps too many times:

if(doubleClick) { this.addToHM(this.getHashMap(cell_column), spec, _dclick); } else { this.addToHM(this.getHashMap(cell_column), spec, _click); } this.addToHM(this.getHashMap(cell_column), spec, _click);

the last "this.addToHM" should be deleted.

On Fri, May 20, 2016 at 7:59 PM, Van Kleunen, Lucy < lucy_van_kleunen@brown.edu> wrote:

Just pulled the dev branch. I see the join year issue - looks like "JoinYear" was saved as "Join Year" in the switch statement in the getHashMap method, which caused that method to return the professor hashmap (the default).

I don't see the field issue though.

On Fri, May 20, 2016 at 5:08 PM, maubinle notifications@github.com wrote:

Ok, I am glad we are coming to a solution. I don't think I can help with the field Interest though as I am not getting this issue.

It was commented out because they were just there if we wanted to use them, but after some time I didn't want it clogging up the screen -- that way we can print just the ones we need right now.

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/brownhci/drafty/issues/114#issuecomment-220718933

lucyvk commented 8 years ago

Oh wait, now I see the subfield issue. It somehow happens after you fix the joinyear issue. I'll look into it too!

lucyvk commented 8 years ago

I see why!

rowValues isn't being instantiated propertly in the Profs file.

Right now it is:

String rowValues = (String) e.getItem().getItemProperty("FullName").getValue()+","+ (String) e.getItem().getItemProperty("University").getValue() +","+ (String) e.getItem().getItemProperty("JoinYear").getValue()+","+ (String) e.getItem().getItemProperty("Rank").getValue()+ ","+ (String) e.getItem().getItemProperty("Subfield").getValue()+","+ (String) e.getItem().getItemProperty("Bachelor").getValue()+","+ (String) e.getItem().getItemProperty("Doctorate").getValue()+","+ (String) e.getItem().getItemProperty("PostDoc").getValue()+","+ (String) e.getItem().getItemProperty("Gender").getValue();

It should just be the name, university, and subfield:

        public void itemClick(ItemClickEvent e) {
            //System.out.println("Click Name: " + (String) e.getItem().getItemProperty("FullName").getValue());
            String rowValues = (String) e.getItem().getItemProperty("FullName").getValue()+","+
                    (String) e.getItem().getItemProperty("University").getValue() +","+
                    (String) e.getItem().getItemProperty("Subfield").getValue();
swallace21 commented 8 years ago

Great work Lucy!

In our last meeting I thought we decided rowValues = all the values for the row clicked on. (except: PhotoURL, Sources)

Only including FullName, University, and Subfield does not appear to match every use case. Plus we are leaving out valuable data that could be of use in the future. The question that needs to be answered is why are only FullName, University, and Subfield included in rowValues?

For example. I am a 40 year old Professor who received my masters at Princeton. I currently teach at Georgia Tech in Computer Vision. I filter the Masters column by Princeton. And then I scroll around and look to see where some of my classmates did their PhD. I am curious where they might be teaching now. Did anyone recently begin a faculty position at a different University? Maybe I see some of their information is incorrect. For example I know my friend Hua did her Bachelors at Berkeley, but Drafty says she did it at UCLA. So I make a new suggestion that she did her Bachelors at Berkeley.

Recording the first 10 columns for each click gives us a clearer picture of their intent. Also because the rowValues array is always in the exact same order it makes sense to include the values from all 10 columns. As long as we have all of the data we can organize and analyze it later on however we want. Always best to gather as much data as possible initially.

Once we have a large set of interaction data that is time stamped we can run further analysis in the future. This will allow us to gain insight into people's sequences of interactions and how it influences their intent. If we find significant results in this area it will be a great finding to include it in the CHI paper.

lucyvk commented 8 years ago

Ok, for now the clicks only add values for those three columns but we can change to record more info! I think that those three are the most important but I can see the argument for recording joinyear and the other universities as well.

I'm out of town this weekend but I can add more columns Sunday night if you want - should be pretty straightforward, just following the same pattern both for initializing the hm's from past interactions and for adding new info.

On May 21, 2016, at 11:50 AM, Shaun Wallace notifications@github.com wrote:

Great work Lucy!

In our last meeting I thought we decided rowValues = all the values for the row clicked on. (except: PhotoURL, Sources)

Only including FullName, University, and Subfield does not appear to match every use case. Plus we are leaving out valuable data that could be of use in the future. The question that needs to be answered is why are only FullName, University, and Subfield included in rowValues?

For example. I am a 40 year old Professor who received my masters at Princeton. I currently teach at Georgia Tech in Computer Vision. I filter the Masters column by Princeton. And then I scroll around and look to see where some of my classmates did their PhD. I am curious where they might be teaching now. Did anyone recently begin a faculty position at a different University? Maybe I see some of their information is incorrect. For example I know my friend Hua did her Bachelors at Berkeley, but Drafty says she did it at UCLA. So I make a new suggestion that she did her Bachelors at Berkeley.

Recording the first 10 columns for each click gives us a clearer picture of their intent. Also because the rowValues array is always in the exact same order it makes sense to include the values from all 10 columns. As long as we have all of the data we can organize and analyze it later on however we want. Always best to gather as much data as possible initially.

Once we have a large set of interaction data that is time stamped we can run further analysis in the future. This will allow us to gain insight into people's sequences of interactions and how it influences their intent. If we find significant results in this area it will be a great finding to include it in the CHI paper.

— You are receiving this because you commented. Reply to this email directly or view it on GitHub

swallace21 commented 8 years ago

No worries, I had already made some of the modifications already. Just finished off modifying the UserInterestService for rowValues.

Wrote about it in-depth here: https://github.com/brownhci/drafty/issues/117

swallace21 commented 8 years ago

Solved, closing.