jerinphilip / ocr-retrain


Active learning pipeline #12

Open Deepayan137 opened 7 years ago

Deepayan137 commented 7 years ago

Hey Jerin, I was going through some lectures and tutorials on active learning. I have prepared a short PPT based on my understanding of active learning and how we should adapt it for our purpose. Could you briefly go over it and let me know if I am missing something or if you have any suggestions? The link is below. Active Learning ppt

jerinphilip commented 7 years ago

Hello, I went through the PPT. I see you want something Word2Vec/DL related. Let's move discussions like this to mail and keep the development discussions here.

Coming back to the subject, there are many feasibility issues in getting such a system ready, the primary one being the lack of data. I'll try to compile more and send it to you in a while. For now, think of building the system and making it DAS ready. Feel free to do your experiments in your free time. If it's merge ready by the end, we can integrate it. I think we should scale down a bit. I hope you understand. The reason we get shot down every time by sir is his concern about the data limits.

Our primary concerns should be:

  1. Build the pipeline first with naive techniques, get them reasonably working. Replace the modules later with the same signature.
  2. Integrate the cost model into the actual simulation to make the cost more deterministic and quantitative. For now I'm averaging and extrapolating using those.
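A minimal sketch of point 1, assuming the "signature" is a plain function that takes the unlabeled pool and a batch size and returns the indices to send for correction. All names here (`Selector`, `random_selector`, `longest_first_selector`, `run_round`) are hypothetical and not taken from the repository; the word-length heuristic is only a stand-in for a smarter module.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical shared signature: (pool, batch size) -> indices to correct.
Selector = Callable[[Sequence[str], int], List[int]]


def random_selector(pool: Sequence[str], k: int) -> List[int]:
    """Naive baseline: pick up to k items uniformly at random."""
    k = min(k, len(pool))
    return sorted(random.sample(range(len(pool)), k))


def longest_first_selector(pool: Sequence[str], k: int) -> List[int]:
    """Drop-in replacement with the same signature: pick the k longest words."""
    order = sorted(range(len(pool)), key=lambda i: len(pool[i]), reverse=True)
    return sorted(order[:k])


def run_round(pool: Sequence[str], selector: Selector, k: int) -> Tuple[List[str], List[str]]:
    """One round: split the pool into the selected items and the remainder."""
    chosen = set(selector(pool, k))
    selected = [w for i, w in enumerate(pool) if i in chosen]
    remaining = [w for i, w in enumerate(pool) if i not in chosen]
    return selected, remaining
```

Because `run_round` only depends on the signature, swapping `random_selector` for a later, better module should not require touching the rest of the loop.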
jerinphilip commented 7 years ago

tl;dr:

Deepayan137 commented 7 years ago

Is there a way to split th

On Aug 7, 2017 10:18 AM, "Jerin Philip" notifications@github.com wrote:

tl;dr:

  • Do the things in your free time. I can lend hands if I'm free as well.
  • Prioritize based on the target. We need one paper in DAS.


Deepayan137 commented 7 years ago

Is there a way to remove the suffix of a word and get its root? If so, the root could serve as a valuable attribute for finding the decision boundary.
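One naive way to do this is rule-based suffix stripping. The sketch below assumes a hand-written suffix list; `SUFFIXES`, `strip_suffix`, and `min_root` are hypothetical names, and the English suffixes are placeholders, since a real list depends on the target language.

```python
# Hypothetical suffix list for illustration only; replace with language-specific suffixes.
SUFFIXES = ("ing", "ed", "s")


def strip_suffix(word, suffixes=SUFFIXES, min_root=3):
    """Return (root, suffix), removing the longest matching suffix.

    The suffix is stripped only if at least min_root characters remain;
    otherwise the word is returned unchanged with an empty suffix.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_root:
            return word[: -len(suffix)], suffix
    return word, ""
```

Both the root and the stripped suffix could then be tried as separate features, though whether they help would need to be checked on the actual data.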
