supermemo / SuperMemoAssistant

A companion app for SuperMemo 17-18 which extends its functionalities through plugins.
https://www.supermemo.wiki/sma/
MIT License
195 stars 20 forks

Computing the confidence of information #93

Open alexis- opened 4 years ago

alexis- commented 4 years ago

ErgonomicFugitive> What you're asking for is Bayesian probability analysis, though that's always going to be highly approximate when you don't have access to the data sets themselves. Short of that, learning to interpret various measures of effect size would pay off well. You also need to identify the particulars of the research methodology to be able to assess how well it will transfer to your own needs. I don't know how familiar you are with research methods, but knowing a wide range of methodologies makes scientific papers much more coherent, and thus readable, and eases extract selection. If a paper is still incoherent, your best bet is to look through the technical terminology in the paper and learn that deeply first. That should also help you build a causal model to assess transferability.

The best safeguard against irreproducible research is to switch from a black-and-white model of information accuracy to a Bayesian probabilistic one. In such a system, your estimate of the likelihood that a hypothesis is true is adjusted in proportion to the volume * quality of the data, which in turn is inversely proportional to the likelihood of retraction. I've been toying with the idea of including rough probability distributions in my items, but since I already do that in my head, it's a lot of effort for little return.
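The update rule described above can be made concrete. Here is a minimal sketch (my own illustration in Python, not anything from the project), assuming the "volume * quality" of the evidence is summarized as a single weight in [0, 1] that tempers the likelihood ratio — weight 1 is a full Bayesian update, weight 0 leaves the prior unchanged:

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) from the prior P(H) and the two likelihoods."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

def weighted_update(prior, p_e_given_h, p_e_given_not_h, weight):
    """Bayesian update tempered by a volume*quality weight in [0, 1].

    A large, high-quality study (weight near 1) shifts confidence more
    than a small noisy one (weight near 0).
    """
    likelihood_ratio = (p_e_given_h / p_e_given_not_h) ** weight
    posterior_odds = prior / (1 - prior) * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)
```

For example, starting from a 50% prior, evidence four times more likely under the hypothesis than under its negation raises confidence to 80% at full weight, and less with a tempered weight.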


AlexisInco> @ErgonomicFugitive Dealing with moving bodies of knowledge is problematic in SuperMemo. Your approach to estimating the confidence of a particular piece of information (= an item in SuperMemo) is interesting, and it is probably already one of the most efficient ways to deal with this. The issue is, of course, how to minimize the cost of applying it. And what information is worth the additional cost?

I am starting to develop several project and plugin ideas dealing with research in SuperMemo. I think your idea could be taken further and optimized by automating certain steps:

alexis- commented 4 years ago

ErgonomicFugitive> A semi-automation of inference could be accomplished, since probability-based statistics is largely algorithmic (especially compared to frequentist statistics). A plugin that handles that could take all the inputs required for Bayesian inference and run the calculations accordingly. There would need to be some way to limit the CPU usage and duration of probability updates, since they can sometimes grow way out of control, and it would be quite annoying for SuperMemo to lock up while it runs those calculations. For problems where not all inputs for Bayesian inference are available, the principle of maximum entropy can be applied instead. If you want to get really ambitious, there are tools for automatically parsing natural language to find claims, as used in http://www.vldb.org/pvldb/vol8/p938-dong.pdf There's also a public release of a powerful natural language AI, GPT-2: https://github.com/openai/gpt-2
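To illustrate the maximum-entropy fallback mentioned above: when all you know about an uncertain quantity is, say, its mean, the maximum-entropy distribution over a finite set of outcomes is an exponential tilting of the uniform distribution, and the tilting parameter can be found by simple bisection. This is a hypothetical sketch, not plugin code:

```python
import math

def maxent_dist(values, target_mean, lo=-5.0, hi=5.0, iters=200):
    """Max-entropy distribution over `values` subject to a fixed mean.

    The solution has the form p_i proportional to exp(lam * v_i);
    we bisect on lam (the Lagrange multiplier) until the mean matches.
    """
    def mean_for(lam):
        weights = [math.exp(lam * v) for v in values]
        z = sum(weights)
        return sum(w * v for w, v in zip(weights, values)) / z

    # mean_for is strictly increasing in lam, so bisection converges.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    weights = [math.exp(lam * v) for v in values]
    z = sum(weights)
    return [w / z for w in weights]
```

With no constraint beyond the natural mean (e.g. a fair die's 3.5), this recovers the uniform distribution; asking for a higher mean shifts mass toward the larger outcomes while staying as "non-committal" as possible.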

alexis- commented 4 years ago

Throttling option: https://github.com/tom-englert/Throttle.Fody
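Throttle.Fody weaves throttling into C# methods at compile time. The underlying idea can be sketched in a few lines (illustrative Python, not the library's API): drop calls that arrive too soon after the last accepted one, so that repeated probability-update requests cannot pile up and lock the UI.

```python
import functools
import time

def throttle(min_interval):
    """Decorator: silently drop calls arriving within `min_interval`
    seconds of the last accepted call (returns None for dropped calls)."""
    def deco(fn):
        last_accepted = [None]

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if last_accepted[0] is not None and now - last_accepted[0] < min_interval:
                return None  # drop instead of blocking
            last_accepted[0] = now
            return fn(*args, **kwargs)
        return wrapper
    return deco
```

Dropping (rather than queueing) is the right fit here: if confidence recomputation is requested ten times in a second, only the first needs to run.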