Issue w. interpreting normalization measures in one-node networks (Version 2.0 beta 25 (2019-09-08))

leifeld / dna

Discourse Network Analyzer (DNA)

Issue w. interpreting normalization measures in one-node networks (Version 2.0 beta 25 (2019-09-08)) #248

Closed. AkselSB closed this issue 2 years ago.

AkselSB commented 2 years ago

Dear Prof. Leifeld,

Working with DNA so far has been a great experience. However, my co-workers and I have trouble interpreting the edge weights when an activity normalization measure has been applied to the actor network and congruence network. We expected the weights to be scaled between 0 and 1, but they are exported as whole numbers (see the excerpt from the CSV file below, actors on both sides).

Properties of the exported network:

variable 1 & 2: Organization & concept
qualifier: agreement
qualifier aggregation: ignore
normalization: average activity

Can you help us interpret these numbers? Additionally, how can we rescale these edge weights to 0-1?

Thank you in advance!

leifeld commented 2 years ago

The result shouldn't consist of integer numbers. The values don't necessarily have to be between 0 and 1, depending on the settings (for example, when you don't remove duplicates), but they should still be small and have decimal places. Two initial thoughts: Does the problem also appear when you change the qualifier aggregation setting to congruence or subtract? Could the problem be that the spreadsheet software messes up the CSV file while a text editor would display the values correctly? I'm happy to look at your database file (or a small excerpt that would reproduce the problem) if you can send it to me.

AkselSB commented 2 years ago

Thank you for the fast response!

I've followed your suggestions and two things come forward:

First, you were right that the spreadsheet ignores the decimal places. The .csv file uses a dot as the decimal separator, which we will correct for. The weights are indeed considerably lower than they appear in the spreadsheet.
Second, even after changing the aggregation settings and removing duplicates, some of the weights are higher than 1 with an activity normalization measure applied.

However, from reading manual section 2.4 "Normalization for One-mode networks" again, I see it says nothing about scaling between 0 and 1, so maybe this is not what we're looking for. Even so, scaled weights would be easier to interpret and present in table form. So if at all possible, we would like to know how to scale the weights. Do you think this is possible? I'll send you the database file, just in case this helps answer the question.

Once again, thank you so much for your support!
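
One generic way to bring exported weights into the 0-1 range, independent of DNA, is to divide every weight by the largest weight in the matrix. The following is a minimal sketch of that idea and not a DNA feature: the function name `rescale_by_max` and the example matrix are hypothetical, and the sketch assumes all weights are non-negative (which may not hold with the "subtract" qualifier aggregation).

```python
def rescale_by_max(weights):
    """Divide every weight by the largest weight so values fall in [0, 1].

    Assumes all weights are non-negative (an assumption of this sketch).
    """
    largest = max(max(row) for row in weights)
    if largest == 0:
        # Nothing to rescale in an all-zero matrix; return a copy.
        return [row[:] for row in weights]
    return [[w / largest for w in row] for row in weights]


# Hypothetical 3x3 actor-by-actor matrix; the values are invented for illustration.
example = [
    [0.0, 1.5, 0.5],
    [1.5, 0.0, 3.0],
    [0.5, 3.0, 0.0],
]
scaled = rescale_by_max(example)
```

After this rescaling, a weight of 1 only means "as strong as the strongest tie in this network", so values are not comparable across networks exported with different settings.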

leifeld commented 2 years ago

OK, I'm glad this could be sorted out.

The average activity normalization simply doesn't standardize to one, as per Equation 2.8 in the manual for version 2.0 beta 25.

Compare this with the cosine similarity in Equation 2.10. The denominator multiplies the activity of both nodes instead of taking their mean (like in Equation 2.8). This is closer to what the numerator does, so I believe it should scale to one.

I think the Jaccard normalization in Equation 2.9 should also scale to one.

But you would need to try it out as I haven't paid much attention to the range of values.
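
To try this out on a toy example, the sketch below compares the three normalizations on two hypothetical organizations. It uses textbook-style definitions that are assumptions here and may differ in detail from Equations 2.8 to 2.10 in the manual: activity is taken to be the row sum, average activity divides the co-occurrence by the mean activity of both nodes, cosine divides by the product of the row vector lengths, and Jaccard counts shared concepts over all concepts used by either node. The function names and the count vectors are invented for illustration.

```python
import math


def cooccurrence(x, y):
    """Sum of products of the two count vectors (shared references)."""
    return sum(a * b for a, b in zip(x, y))


def average_activity(x, y):
    """Co-occurrence divided by the mean activity (row sum) of both nodes."""
    mean_activity = (sum(x) + sum(y)) / 2
    return cooccurrence(x, y) / mean_activity


def cosine(x, y):
    """Co-occurrence divided by the product of the vector lengths."""
    return cooccurrence(x, y) / (
        math.sqrt(cooccurrence(x, x)) * math.sqrt(cooccurrence(y, y))
    )


def jaccard(x, y):
    """Concepts used by both nodes divided by concepts used by either node."""
    both = sum(1 for a, b in zip(x, y) if a > 0 and b > 0)
    either = sum(1 for a, b in zip(x, y) if a > 0 or b > 0)
    return both / either


# Hypothetical statement counts per concept (duplicates NOT removed).
# Each vector must contain at least one non-zero entry.
org_a = [2, 0, 1]
org_b = [2, 1, 0]

print("average activity:", average_activity(org_a, org_b))  # exceeds 1 here
print("cosine:", cosine(org_a, org_b))
print("jaccard:", jaccard(org_a, org_b))
```

Under these assumed definitions, the repeated statements on the first concept push the average activity value above 1, while cosine and Jaccard stay within 0 and 1. This toy case does not cover every setting in DNA, so the actual exports would still need to be checked.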

AkselSB commented 2 years ago

We'll give this a try. Thank you for the help once again!