Rostlab / JS16_ProjectB_Group7

Game of Thrones characters are always in danger of being eliminated. The challenge in this assignment is to estimate how much risk of elimination the characters who are still alive face. The goal of this project is to rank characters by their Percentage Likelihood of Death (PLOD). You will assign a PLOD using machine learning approaches.
GNU General Public License v3.0

PLOD Rankings? #5

Open mammuth opened 8 years ago

mammuth commented 8 years ago

Are you implementing something so we can show, e.g., the top 3 characters most likely to die on the page provided by Project F?

cc @jorjo1 @yashha

dan736923 commented 8 years ago

See #6. We will only provide the PLODs to Project A.

sacdallago commented 8 years ago

Hey, hey, hey.

"Visualize the PLODs" is still your task. You should at least think about how to visualize them in D3 and provide F with an example.

mammuth commented 8 years ago

Well, actually, this issue is about a top-X ranking. If we can get a top-3 ranking from A's database, we would visualize those three characters ourselves.

If Project B needs to make something to be visualized, "PLOD over time for character X" would probably be the best thing. Is that possible?

sacdallago commented 8 years ago

Not really. The predictions are stored, and the prediction machine should be trained again at some stage with more data and the PLODs recalculated. I would say this can be done on a per-season basis, so it is out of the scope of the seminar for now.

As for the viz: they should at the very least help you build these visualizations, and by example I literally mean: call the API of A and render the graph, so that you can just copy-paste the code and change the colors of the graph to be consistent with your palette.

mammuth commented 8 years ago

Maybe I'm not up to date, but what's the graph you're talking about, @sacdallago?

The thing we (or at least I :grin:) were talking about in this issue is this:

[screenshot: 2016-03-17 at 19:06:06]

sacdallago commented 8 years ago

Yeah, but it would also be nice to have a complete ranking (of all characters, or maybe just the important ones), and that needs some thought! :)

k-angelo commented 8 years ago

For that to happen we need a new API call from team A that returns the top 10; otherwise it's not possible. As for helping, we are currently in a feature race that was announced very suddenly mid-week. It was also my intention to provide an example visualization for team F.

Since our primary concern is to make the results better, we can reopen this issue after the 21st, when we deliver our results. Basically, our idea was a simple bar chart using d3.js. If you had something else in mind for the UI, tell us your idea.

sacdallago commented 8 years ago

I'm just suggesting; you are free to do what you want! P.S.: if you get a call where you get all characters (which is already the case) with all of their info (so in the future, also the PLOD), you can infer the ranking by sorting the data meaningfully.
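
A minimal sketch of inferring the ranking by sorting, assuming each character record is a dict with hypothetical `name` and `plod` fields. The actual response shape of A's API is not specified in this thread, and the sample data is made up.

```python
from typing import Any


def top_by_plod(characters: list[dict[str, Any]], n: int = 10) -> list[dict[str, Any]]:
    """Return the n characters with the highest PLOD, highest first."""
    # Characters that have no PLOD yet are skipped rather than treated as 0.
    scored = [c for c in characters if c.get("plod") is not None]
    return sorted(scored, key=lambda c: c["plod"], reverse=True)[:n]


# Made-up sample data, for illustration only.
sample = [
    {"name": "Character A", "plod": 0.42},
    {"name": "Character B", "plod": 0.91},
    {"name": "Character C"},  # no PLOD assigned yet
    {"name": "Character D", "plod": 0.67},
]

ranking = top_by_plod(sample, n=3)
print([c["name"] for c in ranking])
```

Sorting happens wherever the full list is already available; whether that is the frontend or a dedicated call on A's side is a separate decision.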

k-angelo commented 8 years ago

Yes, we get all the characters, but do we really want the frontend, on a central page such as the rankings, to download the information for all 2k+ characters and then parse it to infer the results? I don't think that is a nice idea.

goldbergtatyana commented 8 years ago

Hi Kostas, by default we should be showing only the top ten candidates (only those ten with the highest PLOD) and then allow users to enter the names of any of the other 2k+ characters to show their PLOD. Btw, the list of top 10 should only include meaningful (& interesting) characters. If it happens that by any chance you predict a PLOD for "house of something", then pls pls make sure it doesn't appear in the list of top 10...

k-angelo commented 8 years ago

For users asking for a name, it makes sense to do it on demand. But for the top 10 I have a rather stupid question: what qualifies a character as popular? I agree it makes sense to have only the most popular characters appear there. So it comes down to this:

a) Since the PLODs are static, the top 10 is static. We can look at the top characters, pick the top 10 ourselves based on the known characters, and provide them via a JSON that can be called separately, but that is certainly a strange way to do it.

b) Sure, we can filter "house of", "kingdom of", etc. and apply a rule set that excludes these from the top 10. This doesn't prevent unpopular characters, e.g. a human character mentioned only once, from creeping into the top 10.

c) Use popularity, a.k.a. page rank, to dynamically calculate the characters shown from our static PLODs. But then it becomes more complicated, e.g. what is the popularity threshold for appearing, compared with a strong death probability?

So which approach do you think is suitable given the time frame?
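
To make options b) and c) concrete, here is a rough sketch that combines a name-based rule set with a popularity cutoff. The field names `name`, `plod` and `popularity`, the prefix list, and the threshold value are all assumptions for illustration; nothing here is an agreed API or a tuned value.

```python
# Rule set from option b): entries that are not individual characters.
EXCLUDED_PREFIXES = ("house of", "kingdom of")


def is_individual(name: str) -> bool:
    """True if the name does not start with one of the excluded prefixes."""
    return not name.strip().lower().startswith(EXCLUDED_PREFIXES)


def top_popular(characters, n=10, min_popularity=0.0):
    """Top n by PLOD among individuals at or above the popularity cutoff."""
    eligible = [
        c for c in characters
        if is_individual(c["name"]) and c.get("popularity", 0.0) >= min_popularity
    ]
    return sorted(eligible, key=lambda c: c["plod"], reverse=True)[:n]


# Made-up sample data, for illustration only.
sample = [
    {"name": "House of Something", "plod": 0.99, "popularity": 0.8},
    {"name": "Minor Extra", "plod": 0.95, "popularity": 0.01},
    {"name": "Character A", "plod": 0.70, "popularity": 0.9},
    {"name": "Character B", "plod": 0.50, "popularity": 0.6},
]

shown = top_popular(sample, n=10, min_popularity=0.5)
print([c["name"] for c in shown])
```

This uses a hard cutoff: popularity decides who is eligible, and PLOD alone decides the order. Weighing popularity against a high PLOD, as raised in c), would need a combined score instead.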

sacdallago commented 8 years ago

For point c: Aren't you already calculating a centrality measure to feed the ML device with this new feature? Once you have calculated the centralities, you can just save them somewhere and map characters to centrality. ACTUALLY, it would be damn nice if you could provide the centrality calculation mechanism to A, so that they can integrate it with the database. This would then make it possible to "filter by popularity", similarly to how you would filter by house, kingdom,...

sacdallago commented 8 years ago

CC @Adiolis @kordianbruck

sacdallago commented 8 years ago

P.S.: and I don't simply mean (this time) a method call to get the centrality based on the name, but the actual function that takes the in-/out-bound links for a character, creates the whole character graph, and then assigns betweenness or degree centrality, whatever.
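
A minimal sketch of such a function for degree centrality, assuming the links arrive as a hypothetical mapping from each character to the characters it links to. Links are treated as undirected and the degree is normalised by `n - 1`. Betweenness would additionally need shortest-path counting, for which a graph library such as NetworkX is probably the easier route.

```python
def degree_centrality(links: dict[str, list[str]]) -> dict[str, float]:
    """Build an undirected character graph and return normalised degrees.

    `links` maps a character to the characters it links to (outbound).
    Inbound links are covered because every edge is stored in both directions.
    """
    neighbours: dict[str, set[str]] = {}
    for source, targets in links.items():
        neighbours.setdefault(source, set())  # keep characters without links
        for target in targets:
            if target == source:
                continue  # ignore self-links
            neighbours[source].add(target)
            neighbours.setdefault(target, set()).add(source)

    n = len(neighbours)
    if n < 2:
        return {name: 0.0 for name in neighbours}
    return {name: len(adj) / (n - 1) for name, adj in neighbours.items()}


# Made-up sample links, for illustration only.
links = {"A": ["B", "C"], "B": ["A"], "C": [], "D": ["A"]}
centrality = degree_centrality(links)
print(max(centrality, key=centrality.get))
```

The resulting mapping from character to centrality could then be stored next to the PLOD and used as the popularity value when filtering.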

goldbergtatyana commented 8 years ago

@konstantinos-angelo I find point a) to be a very, very good suggestion!

k-angelo commented 8 years ago

I can't make promises about that, Christian, but we can surely do point a) easily and strive for something better if time allows.

sacdallago commented 8 years ago

Well, I'm just talking about awesome stuff; you are talking about doable :+1: I'm sure you can make it happen! Three days ago, @AlexMoroz thought he needed 6 days to do the integration for C10, and now he has already finished.