Open kisram opened 2 weeks ago
Problem/Question: what should we do with goods that have no subcategory? For example, for Animal food mentions in the 1730s, the sum of the subcategories doesn't match the main category count, because there are 3 products in Animal food that do not belong to any subcategory. Do we ignore the uncategorized goods for this chart, or do we create some pseudo category after all (which they didn't want to do for the Grocery by Categories chart)?
After discussing, we decided that there is no need for a pseudo category like "other". We'll add explanations to the charts, which we've been wanting to do anyway, so it will be clear what the numbers mean. The explanations will also say why the sum of the subcategory mentions might not match the main category mentions.
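To make the mismatch concrete for the chart explanations, here is a minimal sketch of how the gap arises. The records, the subcategory names ("Meat", "Dairy") and the function `count_mentions` are made up for illustration; they are not taken from our data or code.

```python
from collections import Counter

# Hypothetical mention records: (decade, main category, subcategory or None).
# None marks a product that has no subcategory.
mentions = [
    (1730, "Animal food", "Meat"),
    (1730, "Animal food", "Meat"),
    (1730, "Animal food", "Dairy"),
    (1730, "Animal food", None),
    (1730, "Animal food", None),
    (1730, "Animal food", None),
]

def count_mentions(records, decade, category):
    """Return (main category total, per-subcategory counts) for one decade."""
    selected = [sub for d, cat, sub in records if d == decade and cat == category]
    main_total = len(selected)
    # Uncategorized products count toward the main category only.
    by_subcategory = Counter(sub for sub in selected if sub is not None)
    return main_total, by_subcategory

main_total, by_subcategory = count_mentions(mentions, 1730, "Animal food")
uncategorized = main_total - sum(by_subcategory.values())
print(main_total, dict(by_subcategory), uncategorized)
```

With this sample data the main category total exceeds the subcategory sum by exactly the number of uncategorized mentions, which is the difference the chart explanation needs to cover.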
@rausch-supola While updating the labels etc., I came across a new question. The normalized version of the chart doesn't make much sense to me anymore. Before, it showed the total number of documents / the number of documents the category was mentioned in, not counting multiple mentions. Now, if we want it to match the other charts, it would need to show the total number of product mentions / the number of times a product from that category was mentioned, counting each mention separately, even if they were in the same document. While calculating the total number of documents per decade makes sense, calculating the total number of product mentions seems a bit odd to me.
So it would show the total number of mentions of all products / the number of times a product from a specific category was mentioned, right? I think we could nevertheless do it. At least it tells us how often a category is mentioned in relation to all product mentions.
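A small sketch of the two normalizations side by side, in case it helps the decision. It assumes the normalized value is meant as a share (category over total); the comments above write the ratio the other way around, so invert it if that is what the chart actually plots. The records and both function names are hypothetical, and the total number of documents here only covers documents with at least one product mention.

```python
# Hypothetical records for one decade: (document id, category),
# one entry per product mention.
mentions = [
    ("doc1", "Animal food"),
    ("doc1", "Animal food"),
    ("doc1", "Grocery"),
    ("doc2", "Grocery"),
    ("doc3", "Animal food"),
]

def share_by_documents(records, category):
    """Old logic: documents mentioning the category at least once / all documents."""
    all_docs = {doc for doc, _ in records}
    docs_with_category = {doc for doc, cat in records if cat == category}
    return len(docs_with_category) / len(all_docs)

def share_by_mentions(records, category):
    """New logic: mentions of products from the category / all product mentions."""
    category_mentions = sum(1 for _, cat in records if cat == category)
    return category_mentions / len(records)

print(share_by_documents(mentions, "Animal food"))
print(share_by_mentions(mentions, "Animal food"))
```

In this sample, repeated mentions inside doc1 change the mention-based share but not the document-based one, which is the difference between the two versions of the chart.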
We previously adjusted the way the numbers for the Groceries by Category chart are calculated, but the Category mentions over time chart does not yet follow the same logic, i.e. it still uses the number of documents, not mentions. I think this should be consistent. Apparently I forgot to update the label of the y axis: it should say "(Number of) mentions" and not "Number of Documents".