leifeld / dna

Discourse Network Analyzer (DNA)

Issue w. interpreting normalization measures in one-node networks (Version 2.0 beta 25 (2019-09-08)) #248

Closed AkselSB closed 2 years ago

AkselSB commented 2 years ago

Dear Prof. Leifeld,

Working with DNA so far has been a great experience. However, my co-workers and I have trouble interpreting the edge weights when an activity normalization measure is applied to the actor network and the congruence network. We expected these to be scaled between 0 and 1, but instead they are exported as whole numbers (see the excerpt from the CSV file below, actors on both sides).

[image: excerpt from the exported CSV file]

Properties of the exported network:

Can you help us interpret these numbers? Additionally, how can we rescale these edge weights to the range 0-1?

Thank you in advance!

leifeld commented 2 years ago

The result shouldn't consist of integer numbers. They don't necessarily have to be between 0 and 1, depending on the settings, for example when you don't remove duplicates, but they should still be small and have decimal places. Two initial thoughts: Does the problem also appear when you change the qualifier aggregation setting to congruence or subtract? Could the problem be that the spreadsheet software messes up the CSV file while a text editor would display them correctly? I'm happy to look at your database file (or a small excerpt that would reproduce the problem) if you can send it to me.
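The spreadsheet question can also be checked without opening the file in a spreadsheet at all. Below is a minimal sketch in Python that reads the exported text as-is and reports whether any weight has a fractional part. The semicolon delimiter and the sample matrix are assumptions made for illustration, not values taken from an actual DNA export, so adjust them to match your file.

```python
import csv
import io
import math


def raw_cells(csv_text, delimiter=";"):
    """Return every cell of the CSV exactly as written, as strings."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return [cell for row in reader for cell in row]


def has_decimal_weights(csv_text, delimiter=";"):
    """True if at least one numeric cell has a fractional part."""
    for cell in raw_cells(csv_text, delimiter):
        try:
            value = float(cell)
        except ValueError:
            continue  # row/column labels and empty cells are skipped
        if math.isfinite(value) and not value.is_integer():
            return True
    return False


# Hypothetical export: a 2x2 actor-by-actor matrix with labels.
sample = '"";"A";"B"\n"A";0.0;0.6667\n"B";0.6667;0.0\n'
print(has_decimal_weights(sample))
```

If this reports decimals for the real file while the spreadsheet shows integers, the spreadsheet import is the likely culprit rather than the export.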

AkselSB commented 2 years ago

Thank you for the fast response!

I've followed your suggestions and two things come forward:

However, after reading manual section 2.4 "Normalization for One-mode networks" again, I see it says nothing about scaling between 0 and 1, so maybe this is not what we're looking for. Even so, weights between 0 and 1 would be easier to interpret and to present in table form. So if at all possible, we would like to know how to scale the weights. Do you think this is possible? I'll send you the database file, just in case this helps answer the question.

Once again, thank you so much for your support!

leifeld commented 2 years ago

OK, I'm glad this could be sorted out.

The average activity normalization simply doesn't standardize to one, as per Equation 2.8 in the manual for version 2.0 beta 25.

Compare this with the cosine similarity in Equation 2.10. The denominator multiplies the activity of both nodes instead of taking their mean (like in Equation 2.8). This is closer to what the numerator does, so I believe it should scale to one.

I think the Jaccard normalization in Equation 2.9 should also scale to one.

But you would need to try it out as I haven't paid much attention to the range of values.
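One way to try it out is a small numeric comparison. The sketch below uses textbook-style versions of the three measures, built only from the description in this thread (mean of activities versus product-style denominator); the exact formulas in Equations 2.8 to 2.10 of the manual may differ, and the Jaccard variant here treats the data as binary. The function names and the two actor vectors are hypothetical.

```python
import math


def dot(x, y):
    """Sum of element-wise products (the co-occurrence numerator)."""
    return sum(a * b for a, b in zip(x, y))


def average_activity(x, y):
    """Co-occurrence divided by the mean activity of the two nodes."""
    mean_activity = (sum(x) + sum(y)) / 2
    return dot(x, y) / mean_activity if mean_activity else 0.0


def cosine(x, y):
    """Co-occurrence divided by the product of the two vector lengths."""
    denom = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    return dot(x, y) / denom if denom else 0.0


def jaccard(x, y):
    """Shared concepts divided by concepts used by either node (binary view)."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return shared / either if either else 0.0


# Hypothetical statement counts per concept, duplicates not removed.
actor_a = [3, 2, 0]
actor_b = [3, 1, 1]

for name, measure in [("average activity", average_activity),
                      ("cosine", cosine),
                      ("jaccard", jaccard)]:
    print(name, round(measure(actor_a, actor_b), 4))
```

With these counts, the average activity value comes out above 1, while cosine and Jaccard stay within 0 and 1. With binary vectors (duplicates removed), this version of average activity also stays at or below 1, which matches the earlier remark that the range depends on the settings.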

AkselSB commented 2 years ago

We'll give this a try. Thank you for the help once again!