ai-se / ML-assisted-SLR

Automated Systematic Literature Review

Presumptive non-relevant examples #45

Open azhe825 opened 7 years ago

azhe825 commented 7 years ago

The newest result in e-discovery, "Scalability of Continuous Active Learning for Reliable High-Recall Text Classification", mentions one technique to tackle the problem.

Presumptive non-relevant examples, described in "Autonomy and reliability of continuous active learning for technology-assisted review".

May be useful for REUSE.

Testing.

What

Each round, besides all the labeled examples, randomly sample from the unlabeled examples and treat them as negative training examples.

Then train the model.
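A minimal sketch of the sampling step above, in plain Python. The function name `with_presumptive_negatives`, the parameter `n_presumed`, and the (document, label) pair layout are assumptions for illustration only, not FASTREAD's actual code. The sampled documents are only presumed non-relevant for this round: they stay in the unlabeled pool and a fresh sample is drawn next round.

```python
import random


def with_presumptive_negatives(labeled, unlabeled, n_presumed, rng):
    """Build one round's training set.

    labeled:    list of (doc, label) pairs, label 1 = relevant, 0 = non-relevant
    unlabeled:  list of docs that have not been reviewed yet
    n_presumed: how many unlabeled docs to presume non-relevant (hypothetical knob)
    rng:        a random.Random instance, so the sample is reproducible
    """
    k = min(n_presumed, len(unlabeled))
    presumed = rng.sample(unlabeled, k)  # sampling without replacement
    training = list(labeled)  # copy, so the caller's list is untouched
    training.extend((doc, 0) for doc in presumed)  # treat as negatives
    return training


# Toy data, made up for this example.
labeled = [("doc_a", 1), ("doc_b", 0)]
unlabeled = ["doc_c", "doc_d", "doc_e", "doc_f"]

rng = random.Random(0)
training = with_presumptive_negatives(labeled, unlabeled, 2, rng)
# `training` would then be passed to whatever classifier the round uses.
```

Because the sample changes with the random seed, results can vary from run to run, which matches the reliability caveat in the results below.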

Why

E-discovery

SLR

Results

FASTREAD, with and without this technique:

At least as good as not using it. (The worst-case result depends on the pseudo-random sampling, so it is not reliable.)

Transfer learning results with this technique:

Hall as previous SLR,

Wahono as previous SLR,

Abdellatif as previous SLR,

Conclusions

timm commented 7 years ago

thanks for watching the literature

timm commented 7 years ago

"Each round, besides all the labeled examples, randomly sample from the unlabeled examples and treat them as negative training examples." i like the idea.

azhe825 commented 7 years ago

It is simple, but effective.

timm commented 7 years ago

it is important that you soon stop walking circles in new land until you document the land you have visited. you need to get out 2 more papers, quick smart. high priority.

azhe825 commented 7 years ago

Sure. Please create a blank sharelatex project for me. I will fill in the rest.

timm commented 7 years ago

https://www.sharelatex.com/project/5884f514951819482c691de3