Project report doubt issue thread for coaching session 25/11

Doubts Content Iteration One

High Priority

"Results of Data Translation" phase: Most of our field mappings are 1-to-1 translations or split-based string-to-list conversions. Can we just add an explanation column to the integrated schema instead of a detailed write-up?
"How did you create the consolidated schema?": We did it with pandas, so we should be fine there, and we can add some explanation of it.
Our code is currently scattered, with multiple main classes specific to each task. Is the code fine in this format, or should we consolidate it in the final iteration?
"Group size distribution": Do cluster sizes 1 and 2 refer to the clusters after combining d1 + d2 + d3? We would expect cluster size 1 to be the ideal case, because the data is unique at the movie level.
"Consistency": How should we interpret the consistency metric in the data fusion stage?

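To make the first and fourth high-priority doubts concrete, here is a minimal sketch of what we mean by a split-based string-to-list conversion and by a group size distribution. It uses plain Python rather than our actual pandas code; the record, the field names, the separator, and the record and cluster ids are all hypothetical and only serve as illustration.

```python
from collections import Counter


def split_to_list(value, sep=","):
    """Split a delimited string into a list of trimmed, non-empty items."""
    if value is None:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


# Hypothetical source record; field names and separator are assumptions.
source_row = {"title": "Example Movie", "genres": "Crime, Drama ,Thriller"}

target_row = {
    "title": source_row["title"],                   # 1-to-1 translation
    "genres": split_to_list(source_row["genres"]),  # string -> list conversion
}

# Hypothetical matching result: record id -> cluster id after combining d1, d2, d3.
cluster_of = {
    "d1_1": "c1",
    "d2_7": "c1",
    "d3_4": "c2",
    "d1_2": "c3",
    "d2_9": "c3",
}

# Number of records in each cluster.
cluster_sizes = Counter(cluster_of.values())

# Group size distribution: cluster size -> number of clusters of that size.
size_distribution = Counter(cluster_sizes.values())

print(target_row)
print(dict(size_distribution))
```

Under this reading, a cluster of size 1 is a record that was not matched to any record from another dataset, and a cluster of size 2 is a pair of matched records. With pandas, the same conversion could be applied to a whole column with `Series.str.split`.
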
Medium Priority

What is the well-defined scope of "something cool"? Are there any language, framework, or methodology restrictions? Are there extra marks for it (provided everything else is correct)?
"Overall accuracy" of the final dataset: what exactly does this mean?
Do we have to make a presentation, or can we do a walk-through of the report tables? What is the content limit for that?
Does ownership have to be added only in the presentation, or in the report too?

Very Low Priority

Is early feedback on a rough draft possible before the deadline, say in the next coaching session? To whom exactly should the report be addressed: the DWS group, as the template says?

TODO

"Error Analysis": We will also add an analysis comparing the most promising approaches.
"Gold Standard": Find a suitable size for the gold standard used for evaluation.
