Multiple counting issue

sjsakib / cfviz

Visualizes user data from codeforces.com using the official API

http://cfviz.netlify.com


Multiple counting issue #28

Closed maskmanlucifer closed 3 years ago

maskmanlucifer commented 3 years ago

Hey Sakib! I'd like to work on the multiple-counting issue for problems (total solved). I can reduce the redundant counting to an almost negligible level. This comes at a small time cost but works quite well. Should I work on it, or is someone else already working on it?

sjsakib commented 3 years ago

No one is working on that. You can work on it. But can you share what approach you're going to take?

maskmanlucifer commented 3 years ago

For now, the major redundant-counting problems I see are the following (they are caused by problems shared between separate Div. 1 and Div. 2 rounds held on the same day):

On the rating-wise bar graph, solving the same problem in both rounds increases the counter by 2.
On the level-wise bar graph, solving the same problem increases the count at two different levels (e.g. A and C). I don't think we have a solution for this, because we would have to fix a reference point to resolve it.
The solved total is double-counted.
Average attempts, Max attempts and Max AC will contain errors. For example, say I got 10 ACs on a problem as Div. 1 A and some time later got 20 ACs on the same problem as Div. 2 C. With the current code Max AC will be 20, but the real value should be 30.
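
To make the Max AC example concrete, here is a minimal sketch. It assumes problems are keyed by (contestId, index), which is my assumption about the current counting and not something confirmed in this thread. The submissions and contest IDs are made up; the field names follow the Codeforces API submission objects.

```python
from collections import Counter

# Made-up data: the same problem appears as Div. 1 A (contest 1000)
# and as Div. 2 C (contest 1001). IDs and names are hypothetical.
submissions = (
    [{"contestId": 1000, "index": "A", "name": "Shared Problem", "verdict": "OK"}] * 10
    + [{"contestId": 1001, "index": "C", "name": "Shared Problem", "verdict": "OK"}] * 20
)

# Keying by (contestId, index) treats the two appearances as different problems.
ac_per_key = Counter(
    (s["contestId"], s["index"]) for s in submissions if s["verdict"] == "OK"
)

solved = len(ac_per_key)                          # 2, although only one problem was solved
max_ac = max(ac_per_key.values())                 # 20, although the real total is 30
levels = sorted(index for _, index in ac_per_key) # counted under both A and C
```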

maskmanlucifer commented 3 years ago

Now, about the solution. I messaged many problem setters about how contestId is organized on Codeforces, especially for Div. 1 and Div. 2 rounds held on the same day. I also brute-forced about 25 contests of this type, picked at random, and found that in 99% of cases these rounds have consecutive contestIds. The setters' answers agreed: it is rare for same-day Div. 1 and Div. 2 contests to have non-consecutive contestIds. Some such contests do exist, but they are <=1%.
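
A check like this could be sketched as follows. This is not the script that was actually used; it assumes contest objects shaped like the Codeforces API `contest.list` result (fields `id` and `startTimeSeconds`), treats contests with the same start time as paired rounds, and runs on made-up sample data. The helper names are hypothetical.

```python
from collections import defaultdict

def same_start_groups(contests):
    """Group contest ids by start time; keep only groups with more than one contest."""
    groups = defaultdict(list)
    for c in contests:
        groups[c["startTimeSeconds"]].append(c["id"])
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]

def is_consecutive(ids):
    """True if the sorted ids differ by exactly 1 from one to the next."""
    return all(b - a == 1 for a, b in zip(ids, ids[1:]))

# Made-up sample, not real Codeforces data.
contests = [
    {"id": 2000, "startTimeSeconds": 100},
    {"id": 2001, "startTimeSeconds": 100},  # paired with 2000, consecutive
    {"id": 2005, "startTimeSeconds": 200},  # standalone round, ignored
    {"id": 2010, "startTimeSeconds": 300},
    {"id": 2012, "startTimeSeconds": 300},  # paired with 2010, not consecutive
]

groups = same_start_groups(contests)
share = sum(is_consecutive(g) for g in groups) / len(groups)
```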

maskmanlucifer commented 3 years ago

My idea is to remove the redundancy by not counting a problem if a problem with the same name has already been counted and its contestId is +1 or -1 from the current problem's contestId. Basically, use an object map keyed by (contestId, problem name) and write some algorithm to handle the other tasks.
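
A minimal sketch of this rule, written in Python for brevity instead of the project's own language. The function name and the sample submissions are hypothetical; the field names follow the Codeforces API.

```python
def count_solved(submissions):
    """Count solved problems, skipping a problem whose name was already
    counted under the same contestId or a neighbouring one (+1 or -1)."""
    seen = set()  # (contestId, name) of problems already counted
    solved = 0
    for s in submissions:
        if s["verdict"] != "OK":
            continue
        cid, name = s["contestId"], s["name"]
        if any((cid + d, name) in seen for d in (-1, 0, 1)):
            continue  # same problem, already counted
        seen.add((cid, name))
        solved += 1
    return solved

# Made-up data: one problem shared by contests 1000 and 1001,
# plus an unrelated problem with the same name in a distant contest.
sample = [
    {"contestId": 1000, "name": "Shared Problem", "verdict": "OK"},
    {"contestId": 1001, "name": "Shared Problem", "verdict": "OK"},
    {"contestId": 1001, "name": "Shared Problem", "verdict": "WRONG_ANSWER"},
    {"contestId": 1500, "name": "Shared Problem", "verdict": "OK"},
]
total = count_solved(sample)  # 2
```

The set lookup is constant time per submission, so the extra cost is three lookups per accepted submission.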

maskmanlucifer commented 3 years ago

Now, if we want more accurate results, we can change the map key to (contestId, problem name, problem rating). I tested multiple scripts using the two keys above and the results were quite good. I know the map will take a little more time, but it will surely increase our data accuracy. And if you want an even more accurate answer, we can include tags in the map key; that will be the most accurate but also the most time-consuming.
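
A sketch of the stricter key, also merging AC counts so that Max AC adds up across both rounds. It assumes a shared problem carries the same rating in both rounds, which is not verified here. The function name and sample data are hypothetical.

```python
def merged_ac_counts(submissions):
    """Sum AC counts per problem, merging entries that share name and rating
    and whose contestIds are equal or differ by 1."""
    counts = {}  # (contestId, name, rating) -> number of ACs
    for s in submissions:
        if s["verdict"] != "OK":
            continue
        cid, name, rating = s["contestId"], s["name"], s.get("rating")
        key = (cid, name, rating)
        # Reuse an existing entry from a neighbouring contest if there is one.
        for d in (-1, 0, 1):
            candidate = (cid + d, name, rating)
            if candidate in counts:
                key = candidate
                break
        counts[key] = counts.get(key, 0) + 1
    return counts

# Made-up data: 10 ACs in contest 1000 and 20 ACs in contest 1001 on the same
# problem, plus a same-named problem with a different rating.
sample = (
    [{"contestId": 1000, "name": "Shared Problem", "rating": 1500, "verdict": "OK"}] * 10
    + [{"contestId": 1001, "name": "Shared Problem", "rating": 1500, "verdict": "OK"}] * 20
    + [{"contestId": 1001, "name": "Shared Problem", "rating": 800, "verdict": "OK"}]
)
counts = merged_ac_counts(sample)
max_ac = max(counts.values())  # 30
```

Adding tags to the key would follow the same pattern, for example by appending a sorted tuple of the problem's tags.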

maskmanlucifer commented 3 years ago

Please share your thoughts on this. Should I start working on it?

maskmanlucifer commented 3 years ago

You can verify the points I have mentioned above and then tell me your decision, or point out any flaws in my approach.

sjsakib commented 3 years ago

Okay, go ahead.