themoonsheep / moonsheep

Moonsheep digitizes huge, messy paper and PDF archives through crowdsourcing and cutting edge technology.
http://moonsheep.org
GNU Affero General Public License v3.0
9 stars 3 forks source link

Optimize computing progress #138

Open KrzysztofMadejski opened 4 years ago

KrzysztofMadejski commented 4 years ago

How do we generally show progress: 1) Should it give an idea how much work has been started? (but not crosschecked?) 2) Should it give an idea how much work has been finished? (it is crossverified) Let's start with (2) and maybe in the future show also progress of those "in progress", which will have higher values.

Progress overall is average progress across all docs.

Progress of a document is an average total progress of all of tasks created on such document.

Own progress of a task is:

Total progress (with children) of a task is:

UI:

? How to compute that effectively?

  1. In some background job working from bottom up.
    1. create a tree of task types
    2. compute and cache progress for all tasks and all levels starting at the lowest
    3. compute progress for docs
    4. compute overall progress
  2. update those when value changes for a single task? Maybe yes to get instant feedback + do periodic update in a background run
  3. Can we build it incrementally (without first run)? Also yes

What are the operation? What are the costs? What tech stack should be used?

Storage:

Optimization strategies:

KrzysztofMadejski commented 4 years ago

Current situation is that on each entry total_progress of task and its parents is recomputed during request cycle.

Presentation in the UI: tasks-with-percentages