statnet / COVID-JustOneFriend

GNU General Public License v3.0

switch from 3-paths to average distance between reachable nodes? #3

Closed sgoodreau closed 4 years ago

sgoodreau commented 4 years ago

Martina suggested this, and it's a good idea. Seems likely to be much more intuitive to people - it matches the "degrees of separation" concept from the real world.

My (Steve's) only concern is that the difference won't be nearly as great between the essential network and the friend network. I just spent a few minutes pondering it and realizing why. The friend network adds a lot more people into the pool of the reachables, meaning that overall there may still be a lot of folks who are reachable along long paths (and weren't reachable at all before). So the average might not change as much as we think. But, the only way to know is to try. Martina volunteered to explore this, but she is also a busy bee so anyone else who wants to take a stab and see what the numbers look like, go for it!

dth2 commented 4 years ago

I will take a stab.

EmilyPo commented 4 years ago

I’m interested to see these results, but I’ve been thinking about it and while I agree that “reachable nodes” is an easily understandable concept, I am worried that we will stray too far into the realm of “tragedy of the commons” in terms of public messaging.

Especially if the reachable set is still large across scenarios (which I imagine it will be), then we might actually encourage people to stray from the social distancing guidelines by either thinking that “oh well the reachable set is large anyway” or “that idea assumes there are no breaks in transmission, and there surely will be, but it doesn’t have to be me”.

The 3-step measure, while it is perhaps harder to explain why we would choose this distance, is more local and immediate to the reader. It’s something they/their household could easily influence.

Em


On Apr 7, 2020, at 7:51 PM, dth2 notifications@github.com wrote:

 I will take a stab.


dth2 commented 4 years ago

The other issue that I am now looking at as I do this is that the average distance between nodes can only be calculated across connected nodes, so it requires an additional level of consideration from the reader. An increase in average distance from 3 to 5 is very different when the entire network is still connected vs. when half the population is now isolates and the distance between what remains is 5. The count of 3-paths sort of captures both the increase in distance between connected nodes and the effect of isolates at the same time.

Given this, is there still interest in seeing the mean geodesic?
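
To make this caveat concrete, here is a minimal base-R sketch on two made-up six-household networks (they are not the simulated networks from this project). The helpers `geodesics()`, `make_adj()` and `summarize_distance()` are hypothetical names introduced for illustration; `geodesics()` is a plain Floyd-Warshall stand-in for the distance matrix that `geodist(net, inf.replace=NA)$gdist` would return, so the example runs without sna.

```r
# Toy illustration only: the adjacency matrices are invented, not project data.

# All-pairs shortest path lengths (Floyd-Warshall); Inf means "no path".
geodesics <- function(adj) {
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(nrow(adj))) {
    d <- pmin(d, outer(d[, k], d[k, ], `+`))
  }
  d
}

# Build a symmetric adjacency matrix from a two-column edge list.
make_adj <- function(n, edges) {
  adj <- matrix(0, n, n)
  adj[edges] <- 1
  adj[edges[, 2:1, drop = FALSE]] <- 1
  adj
}

# Mean geodesic over reachable pairs, plus the share of pairs that are reachable.
summarize_distance <- function(adj) {
  d <- geodesics(adj)
  pair.dist <- d[upper.tri(d)]
  reachable <- is.finite(pair.dist)
  c(mean.geodesic = mean(pair.dist[reachable]),
    share.reachable = mean(reachable))
}

chain  <- make_adj(6, cbind(1:5, 2:6))   # 1-2-3-4-5-6, everyone connected
broken <- make_adj(6, cbind(1:2, 2:3))   # 1-2-3 plus three isolates

res.chain  <- summarize_distance(chain)
res.broken <- summarize_distance(broken)
print(round(rbind(chain = res.chain, broken = res.broken), 2))
```

In this toy case the mean geodesic is shorter in the network with isolates, even though far fewer pairs can reach each other, which is why the mean needs the share of reachable pairs reported alongside it.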

netterie commented 4 years ago

Those are both really good points considering this is for a broad audience.

I was really curious about this, so I went ahead and looked at the geodesic distance distributions - I think! If you have time to check my code below, I would appreciate it!

I used distance = 0 to indicate when there is no path. It confirms what we would expect. I find my eye tracking the mode and the spread. I'm pretty sold on the Essential network :)

[image: bar chart of geodesic distance frequencies, grouped by network]

# Geodesics - run after running Steve's code
library(sna)      # geodist()
library(ggplot2)  # ggplot(), used for the plot at the end
geo.table <- function(net, netname) {
  # Take upper triangle of geodesic matrix, and
  # then make a table of the distances
  geo.mat <- geodist(net, inf.replace=0)$gdist
  df <- data.frame(table(geo.mat[upper.tri(geo.mat)]))
  # Make sure the distance is recorded as numeric, not as a factor.
  # Converting the labels directly also handles distances above 20.
  df$Var1 <- as.numeric(as.character(df$Var1))
  # Add column title and network name
  colnames(df)[1] <- 'Geodesic Distance'
  df$Network <- netname
  return(df)
}

# Get all tables into one data frame
geo <- rbind(geo.table(net.precov, 'Pre-Covid'),
      geo.table(emptynet, 'Pure Isolation'),
      geo.table(net.essl, 'Essential Only'),
      geo.table(net.comb.1, 'Just One Friend'),
      geo.table(net.comb.2, 'Just One Friend Per HH'))

# Order networks
net.order <- c('Pre-Covid', 'Pure Isolation', 
               'Essential Only', 'Just One Friend', 
               'Just One Friend Per HH')
geo$Network <- factor(geo$Network, levels=net.order,
                      labels=net.order)

# Plot
ggplot(geo, aes(x=`Geodesic Distance`, y=Freq, fill=Network)) +
  geom_bar(stat='identity', position=position_dodge())

dth2 commented 4 years ago

If we ignore the issue with isolates and just focus on the distance between connected households:

The starting network: 2.18
Just the essential workers: 5.39
Just two friends: 4.67
Just one friend: 6.16

Note that the distance is longer for just 1 friend than the essential worker network but the number of connected households is much higher. This may be harder to message than the 3 paths.

If this seems fruitful let me know.

sgoodreau commented 4 years ago

OK, so average distance doesn't seem like the way to go after all. Reachable vs unreachable tells a clear story - which is largely but not fully covered by the largest component - but conditional on reachable, distance doesn't tell a clear story after that.

Thanks Deven - I'm adding your name on now, even though this didn't make it in.

Open to additional ideas for metrics.

In the meantime, I do like Emily's take on it, and I'm going to tweak the explanation a bit. There is language down at the end about the kinds of ways that one might unknowingly infect someone three degrees away from them - a distance that is close enough that you can conceive of them in terms of concrete relationships, but far enough that you can imagine not knowing that you've infected that person. I will add a bit more language like that up front when I first mention 3-paths to make it a bit more concrete.

sgoodreau commented 4 years ago

Just had a call with @martinamorris about other things but we touched on this. After we hung up, I think I see a middle way. She is most interested in seeing the language of "degrees of separation" because it's familiar to people. So instead of average distance, how about the total number of pairs that are 3 degrees of separation or less? It still has the arbitrariness of 3, but at least it uses the "degree of separation" idea. Note that this isn't the same as the 3-paths, even though there is some overlap. Two HH can be at the ends of a 3-path, but also be connected directly and so have 1 degree of separation. Paths accumulate quickly (many 3-paths in a 6-path) while these pairs don't in the same way (somewhat like stars vs degree). So, anyone up for calculating these numbers? @netterie? @dth2? @EmilyPo? Others?

In the end, I think that for most audiences the network figures tell most of the story anyway and all of this is detail. And with the pages now split up into tabs, that gets emphasized so much more clearly than before. But it is still worth getting right!
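
The difference between counting 3-paths and counting pairs within 3 degrees of separation can be seen on a very small example. The sketch below uses a made-up ring of four households and base R only; `count.3paths()` and `geodesics()` are hypothetical helpers written for this illustration (in the thread's own setup, `kpath.census()` and `geodist()` from sna play those roles).

```r
# Toy illustration only: a made-up ring of four households, 1-2-3-4-1.
adj <- matrix(0, 4, 4)
ring <- cbind(1:4, c(2:4, 1))
adj[ring] <- 1
adj[ring[, 2:1]] <- 1

# Count 3-paths: sequences of four distinct households joined by three ties.
# Each undirected path is found once in each direction, hence the division by 2.
count.3paths <- function(adj) {
  n <- nrow(adj)
  total <- 0
  for (i in 1:n) for (j in 1:n) for (k in 1:n) for (l in 1:n) {
    distinct <- length(unique(c(i, j, k, l))) == 4
    if (distinct && adj[i, j] == 1 && adj[j, k] == 1 && adj[k, l] == 1) {
      total <- total + 1
    }
  }
  total / 2
}

# Geodesic (shortest path) lengths via Floyd-Warshall; Inf means "no path".
geodesics <- function(adj) {
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(nrow(adj))) {
    d <- pmin(d, outer(d[, k], d[k, ], `+`))
  }
  d
}

d <- geodesics(adj)
pair.dist <- d[upper.tri(d)]          # one distance per pair of households

n.3paths         <- count.3paths(adj)     # 3-paths in the ring
n.pairs.within3  <- sum(pair.dist <= 3)   # pairs at 3 degrees or less
n.pairs.exactly3 <- sum(pair.dist == 3)   # pairs exactly 3 degrees apart
print(c(paths3 = n.3paths, within3 = n.pairs.within3, exactly3 = n.pairs.exactly3))
```

Here the ring contains four 3-paths, yet no pair of households is exactly 3 degrees apart, because the two ends of every 3-path are also direct neighbors. All six pairs are within 3 degrees.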

martinamorris commented 4 years ago

To me, that graph above is great (I think I'd facet the different distn's rather than plot them together).

I get that the reachability is limited to the giant component, and this adds some conditional logic, plus the need to report 2 numbers: component size (everybody here can reach everybody else), and geodesic distribution within the component. I just think 3-paths are much more complicated, given their embedding and the difference between 3 links and 4 nodes.

I think the point I'm trying to find a way to articulate here is the difference between your individual view of the network that puts you at risk, and the actual risk. Direct vs. indirect connectivity. I think that's what we're all trying to find a way to express, simply.

Longer geodesics are exactly the point -- it used to be you had 15 possible people who could directly infect you (pre-Covid) because they were 1 step away, another large number could get you in 2 steps, no one was more than 3 steps away. Many of these 2-3 step folks you could see as well. There are no long, invisible geodesics. So what you could see was a good proxy for your risk.

But now, when you have just one friend, your direct risk looks minimal, even if you add your friend's friend. Meanwhile, the long geodesics tell another story. In this case, your local view of risk doesn't begin to capture your indirect risk -- it underestimates it significantly.

I have to work on the backcalc now, so can't play with this. But if anyone is inspired, please do.

netterie commented 4 years ago

@sgoodreau Am I understanding correctly that you're curious about the sum of 1, 2 and 3-paths?

(paths.precov <- kpath.census(net.precov, mode='graph', tabulate.by.vertex=FALSE)$path.count)
     1      2      3 
  1567  24486 380880 

sum(paths.precov)
[1] 406933
sum(paths.essl)
[1] 422
sum(paths.comb.1)
[1] 4066
sum(paths.comb.2)
[1] 1236

martinamorris commented 4 years ago

I don't think you can sum these. 1-paths are embedded in 2-paths, which are embedded in 3-paths, so this double and triple counts some of these paths.

sgoodreau commented 4 years ago

@netterie, alas no - it's the number of HH pairs that are 3 degrees of separation or less from each other. That's not calculable directly from the path distribution. But it should be just as easy -- i.e. the sum of the bars at x=c(1,2,3) in your geodesic distance diagram above.

netterie commented 4 years ago

Oh, I see!

Here is what I calculated for the geodesics 1, 2 and 3. I also checked the mean. Deven, I got slightly different numbers. Are you sure you excluded the isolates? Otherwise, can you check my code and see if you find an error? See below.

------------ Pre-Covid -------------

Total dyads: 19900
Mean geodesic is 2.19
Sum of 1,2, and 3 is 19900 or 100 % of dyads

------------ Pure Isolation -------------

Total dyads: 19900
Mean geodesic is 0
Sum of 1,2, and 3 is 0 or 0 % of dyads

------------ Essential Only -------------

Total dyads: 19900
Mean geodesic is 5.73
Sum of 1,2, and 3 is 422 or 2.12 % of dyads

------------ Just One Friend -------------

Total dyads: 19900
Mean geodesic is 4.55
Sum of 1,2, and 3 is 3517 or 17.67 % of dyads

------------ Just One Friend Per HH -------------

Total dyads: 19900
Mean geodesic is 6.48
Sum of 1,2, and 3 is 1197 or 6.02 % of dyads

geo.table <- function(net, netname) {
  # Take upper triangle of geodesic matrix, and
  # then make a table of the distances
  geo.mat <- geodist(net, inf.replace=0)$gdist
  df <- data.frame(table(geo.mat[upper.tri(geo.mat)]))
  # Make sure the distance is recorded as numeric, not as a factor.
  # Converting the labels directly also handles distances above 20.
  df$Var1 <- as.numeric(as.character(df$Var1))
  # Print mean and sum of 1, 2 and 3
  # (distance 0 marks unreachable pairs, so drop it first)
  df_nozero <- subset(df, Var1!=0)
  # Weight each observed distance (not its row position) by its share
  # of reachable pairs
  geo.mean <- sum(prop.table(df_nozero$Freq)*df_nozero$Var1)
  sum123 <- sum(subset(df_nozero,
                       Var1%in%c(1,2,3))$Freq)
  # Use the actual number of dyads rather than a hard-coded 19900
  total.dyads <- sum(df$Freq)
  cat('\n\n------------',netname, '-------------\n')
  cat('\nTotal dyads:', total.dyads, '\n')
  cat('Mean geodesic is', round(geo.mean,2), '\n')
  cat('Sum of 1,2, and 3 is', sum123, 'or ', round(100*sum123/total.dyads,2), '% of dyads\n\n')

  # Add column title and network name
  colnames(df)[1] <- 'Geodesic Distance'
  df$Network <- netname
  return(df)
}

# Get all tables into one data frame
geo <- rbind(geo.table(net.precov, 'Pre-Covid'),
      geo.table(emptynet, 'Pure Isolation'),
      geo.table(net.essl, 'Essential Only'),
      geo.table(net.comb.1, 'Just One Friend'),
      geo.table(net.comb.2, 'Just One Friend Per HH'))

dth2 commented 4 years ago

I think the difference may just be in the random probabilities of ties in the ERGM simulation.

martinamorris commented 4 years ago

Ok, I think this is getting somewhere. It just occurred to me that the idea of "social distance" is to increase distance (doh). Which is what the interventions above are doing. So, they're working.

BUT, component/cluster size is still an important additional indicator (we were always going to use this, regardless of whether we also reported the path or geodesic).

So, what this means is that:

  1. Social distancing works: it reduces the "network connectivity" which we define here as the number of reachable households in the cluster

  2. But social distancing also makes it harder to see the connectivity that remains: the HH you're connected to are now 5-6 degrees of separation away, so they're invisible to you. But they can still reach you, and you can reach them.
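
Both indicators in this list, the number of reachable households and how far away they are, can be read off a geodesic distance matrix. The following is a minimal base-R sketch on seven made-up households (not the project's networks); `geodesics()` is a hypothetical helper standing in for `geodist(net, inf.replace=NA)$gdist` from sna.

```r
# Toy illustration only: seven made-up households, not project data.
# Ties 1-2, 2-3, 3-4 form one cluster; 5-6 form another; 7 is isolated.
edges <- cbind(c(1, 2, 3, 5), c(2, 3, 4, 6))
adj <- matrix(0, 7, 7)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Geodesic (shortest path) lengths via Floyd-Warshall; Inf means "no path".
geodesics <- function(adj) {
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(nrow(adj))) {
    d <- pmin(d, outer(d[, k], d[k, ], `+`))
  }
  d
}

d <- geodesics(adj)
diag(d) <- NA                          # a household does not count as reaching itself

# Households reachable from each household, at any distance
n.reachable <- colSums(is.finite(d))   # is.finite() is FALSE for NA and Inf
cluster.size <- n.reachable + 1        # the household's cluster includes itself

largest.cluster  <- max(cluster.size)
mean.reachable   <- mean(n.reachable)
longest.geodesic <- max(d[is.finite(d)])
print(c(largest.cluster = largest.cluster,
        mean.reachable = mean.reachable,
        longest.geodesic = longest.geodesic))
```

The first two numbers speak to point 1 (how much connectivity remains), and the longest geodesic speaks to point 2 (how far away the still-reachable households are).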

sgoodreau commented 4 years ago

I think I see what you're getting at with point #2. That said, it's still different from the second point I've been trying to get across. That may explain why our metrics of interest aren't lining up.

For me, the two points are driven by the fact that COVID, like most pathogens, doesn't transmit at anything like 100% probability. So being in the same component is a first-line measure of transmission potential (that's our measure 1), but it's not anything like a guarantee. But being connected by lots and lots of short- to medium-length paths (measure 2) is a whole other domain of potential. This is what I have been trying to articulate. And it is totally different in the friendship network than in just the essential network.

The presence or absence of long paths above and beyond that also tells us something, but I think of that as less important than these first two, in terms of a direct measure of transmission potential. And I don't think we want to do more than two measures here - it's complicated enough for people.
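
A rough way to see why many short paths matter when transmission is far from certain is the back-of-the-envelope calculation below. It is only a sketch under strong assumptions that are not from this thread: the per-tie probability `p` is a hypothetical value, every tie is assumed to transmit independently, and the paths are assumed not to share ties (real paths in a network overlap, so the true figure would differ). The function names are made up for the illustration.

```r
# Hypothetical per-tie transmission probability; not an estimate for COVID.
p <- 0.2

# Chance that infection passes along one specific path of k ties,
# assuming each tie transmits independently with probability p.
path.prob <- function(p, k) p^k

# Chance that at least one of m paths of length k transmits,
# assuming (unrealistically) that the paths share no ties.
any.path.prob <- function(p, k, m) 1 - (1 - p^k)^m

single <- path.prob(p, 3)          # one 3-path
many   <- any.path.prob(p, 3, 50)  # fifty non-overlapping 3-paths
long   <- path.prob(p, 6)          # one 6-path
print(c(single = single, many = many, long = long))
```

Under these assumptions a single 3-path carries little risk and a single 6-path carries far less, while a large number of 3-paths adds up to a substantial chance, which is the intuition behind measure 2.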

martinamorris commented 4 years ago

I think the reason the "low prob of transmission" matters less to me here is that these are repeated contacts, over a long enough period, that it raises the overall probability.

The reason I think the long paths are important is that it explains to people why their intuition is wrong. They can't see out that far -- no one can, but the virus can...

netterie commented 4 years ago

@martinamorris @sgoodreau

I modified my prior code to tabulate how many k-paths each node is in. Here's what I got for the average (computed including zeroes):

------------ Pre-Covid -------------

Total nodes: 200
Nodes on average are in 7617.6 3 -paths

------------ Pure Isolation -------------

Total nodes: 200
Nodes on average are in 0 3 -paths

------------ Essential Only -------------

Total nodes: 200
Nodes on average are in 2.74 3 -paths

------------ Just One Friend -------------

Total nodes: 200
Nodes on average are in 56.8 3 -paths

------------ Just One Friend Per HH -------------

Total nodes: 200
Nodes on average are in 13.54 3 -paths

Unfortunately I was not able to complete a histogram for you, because we need to bin the number of 3-paths to make a meaningful plot. I apologize, I don't have time to do that right now and I'm not exactly sure when my next opportunity will be! I hope it helps to have the code, below.

Code

# Number of k-paths nodes are in
kpath.table <- function(net, netname, knum=3) {
  # Get k-path count by node (default gives 1,2 and 3 paths)
  # Remove the "Aggregate" column, and then make a table
  kpath.mat <- kpath.census(net, mode='graph', 
                            tabulate.by.vertex=TRUE)$path.count
  kpath.mat.knum <- kpath.mat[knum,-which(colnames(kpath.mat)=='Agg')]
  df <- data.frame(table(kpath.mat.knum))
  # Make sure the distance is recorded as numeric
  # not character
  numeric.dist <- min(kpath.mat.knum):max(kpath.mat.knum)
  names(numeric.dist) <- as.character(numeric.dist)
  df$kpath.count <- numeric.dist[names(numeric.dist)%in%df$kpath.mat.knum]
  # Print mean - include zeros
  kpath.mean <- sum(prop.table(df$Freq)*df$kpath.count)
  cat('\n\n------------',netname, '-------------\n')
  cat('\nTotal nodes:', sum(df$Freq), '\n')
  cat('Nodes on average are in', round(kpath.mean,2), knum, '-paths\n')

  # Add column name network name
  colnames(df)[3] <- paste('Number of', knum, 'paths')
  df$Network <- netname
  return(df)
}

# Get all tables into one data frame
kpath <- rbind(kpath.table(net.precov, 'Pre-Covid'),
             kpath.table(emptynet, 'Pure Isolation'),
             kpath.table(net.essl, 'Essential Only'),
             kpath.table(net.comb.1, 'Just One Friend'),
             kpath.table(net.comb.2, 'Just One Friend Per HH'))

netterie commented 4 years ago

@martinamorris Here's the mean plus the distribution summary for each network:

------------ Pre-Covid -------------

Total nodes: 200
Nodes on average are in 7617.6 3 -paths
Summary of distribution:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1978    5493    7163    7618    9546   19104

------------ Pure Isolation -------------

Total nodes: 200
Nodes on average are in 0 3 -paths
Summary of distribution:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0

------------ Essential Only -------------

Total nodes: 200
Nodes on average are in 2.74 3 -paths
Summary of distribution:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    0.00    0.00    2.74    2.00   54.00

------------ Just One Friend -------------

Total nodes: 200
Nodes on average are in 56.8 3 -paths
Summary of distribution:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00   16.75   38.00   56.80   77.00  345.00

------------ Just One Friend Per HH -------------

Total nodes: 200
Nodes on average are in 13.54 3 -paths
Summary of distribution:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    0.00    7.50   13.54   18.00  133.00

sgoodreau commented 4 years ago

Thanks much, @netterie! Meanwhile, @martinamorris and I had a side conversation to try to get on the same page. I think we may be narrowing in on something. We both agree on a switch from path counts to counts of HH within a given distance ("degree of separation" in the common parlance). These may sound the same but aren't. I'm more interested in 3 degs of sep and she in 6, for various reasons, but once we calculate one it's easy to do both.

So what I think we have converged on (unless I'm still misunderstanding some aspect of the discussion) is a metric for the mean number of HH within X degrees of separation from a HH, where X=c(3,6).

In other words (just to be absolutely unambiguous):

I think this is actually a fairly easy calculation from the geodist matrix. I will aim to do it tomorrow, unless you are able to get to it first.

Thanks!!

netterie commented 4 years ago

@sgoodreau @martinamorris I see why you like this statistic - it does strike me as more straightforward to think about! I added one more statistic, "average reachable" - instead of counting geodesic <=X, I just counted reachable nodes of any distance, before taking the mean.

                            3      6 avg.reachable
Pre-Covid              199.00 199.00        199.00
Pure Isolation           0.00   0.00          0.00
Essential Only           4.22  10.11         15.50
Just One Friend         35.17 147.97        161.12
Just One Friend Per HH  11.97  49.47         97.60
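
As a cross-check, the "3" column above lines up with the dyad counts reported earlier in this thread: each pair within 3 degrees contributes to the count of both of its households, so the per-household mean should equal 2 × pairs / 200. The sketch below only re-does that arithmetic with the numbers already posted; the variable names are made up for the illustration.

```r
# Numbers copied from the earlier "Sum of 1,2, and 3" output in this thread.
n.hh <- 200
total.dyads <- choose(n.hh, 2)   # number of household pairs

pairs.within3 <- c("Pre-Covid"              = 19900,
                   "Pure Isolation"         = 0,
                   "Essential Only"         = 422,
                   "Just One Friend"        = 3517,
                   "Just One Friend Per HH" = 1197)

# Each pair is counted once from each end, so the node-based mean is 2 * pairs / n
mean.within3 <- 2 * pairs.within3 / n.hh

# Same quantity as a share of the other 199 households
pct.of.others <- 100 * mean.within3 / (n.hh - 1)

print(cbind(mean.within3, pct.of.others = round(pct.of.others, 2)))
```

The means reproduce the "3" column, and the percentages reproduce the share of dyads reported earlier, so the node-based and dyad-based versions of the 3-degree metric carry the same information.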

Code

# Geodesics separation
geo.sep <- function(net, netname, distances=c(3,6)) {
  # Geodesic matrix - if not reachable, use NA
  geo.mat <- geodist(net, inf.replace=NA)$gdist
  # How many households are separated by distance x or less?
  # Do a node-based, not dyad-based, mean
  means <- sapply(distances, function(x) {
    geo.count <- (geo.mat<=x)
    # geo.count[lower.tri(geo.count, diag=TRUE)] <- NA
    diag(geo.count) <- NA
    return(mean(colSums(geo.count, na.rm=TRUE)))
  })

  # Compute reachable nodes
  reachable <- geo.mat
  diag(reachable) <- NA
  num.reachable <- colSums(!is.na(reachable))
  #print(summary(num.non.isolates))
  avg.reachable <- mean(num.reachable)

 # Combine distances counts with reachable count
  means <- c(means, avg.reachable)
  names(means) <- c(distances, 'avg.reachable')

  return(means)
}

# Compute
distances <- rbind( geo.sep(net.precov, 'Pre-Covid'),
    geo.sep(emptynet, 'Pure Isolation'),
    geo.sep(net.essl, 'Essential Only'),
    geo.sep(net.comb.1, 'Just One Friend'),
    geo.sep(net.comb.2, 'Just One Friend Per HH'))

# Labels
net.order <- c('Pre-Covid', 'Pure Isolation', 
               'Essential Only', 'Just One Friend', 
               'Just One Friend Per HH')
rownames(distances) <- net.order

# Show
distances

dth2 commented 4 years ago

I love that! I thought that the reachability within 6, which is canonical, would end up being meaningless on a network this small but I think it really works here. All of this work is fantastic! Go Team!

martinamorris commented 4 years ago

I like reachable too (always have :).

Still think it is worth leveraging the cultural familiarity with 6 degrees -- but the reachable tracks it closely. Would be nice to see the actual distributions overlaid -- not just the averages :)

sgoodreau commented 4 years ago

This is awesome, thank you!!!!!!! Will implement now on the webpage.

sgoodreau commented 4 years ago

OK, I have just implemented both 3 and 6 degs of sep throughout the document, and redone all of the writing to reflect this.

And, having spent a few hours on this, I now realize that it's too much. We have three metrics in there, and they all tell essentially the same story. It will overwhelm people, for no good reason.

So after all this, I find myself thinking maybe we should reduce all the way down to one metric and make life easy. The question is which one.

Please take a look and see what you think.

Some constraints as you think about options:

  1. Someone from UW News already has a press release that includes the reachability numbers. So if we wanted to take that out (e.g. only do 6 degs of sep) then we need to contact him ASAP. If it weren't for this, I would actually be fine with doing only 6 degs of sep. So if, e.g., Martina, you read this and strongly prefer this option, you should contact him ASAP to tell him to hold off spreading.

  2. If we decide to go from 3 metrics down to 2, I feel quite strongly that it should not be c(largest component size, 6 degs of sep). Both of these get at long distance paths, and I feel strongly (always have, although perhaps I am alone in this) that for something with a short duration (like COVID) and/or low transprobs (like HIV), shorter paths tell a key part of the story. I can give that up if we're just doing one metric, but would want that in if we're doing two.

  3. I am now done for the day, having obligations to my newly reconstituted household. And was planning to take much of tomorrow off as well to deal with car issues and the like.

So see what you think, given the current setup and these constraints.

Thanks! Steve

sgoodreau commented 4 years ago

Forgot to tag you all in the previous comment to make sure you saw it. @martinamorris @dth2 @netterie @EmilyPo

Also note that I built it to SocNetDist.html in the github repository, but did not copy it over to doc/index.html to push it to the public. So make sure to look in the right place.

sgoodreau commented 4 years ago

OK, @martinamorris @dth2 @netterie @EmilyPo, I slept on it and decided that I could live with a switch to largest component and 6 degs of sep, if that's what everyone else thinks is the best combo :-)

Share your vote, and I will implement and push this evening or tomorrow. And then close this chapter!!!

EmilyPo commented 4 years ago

I really like the 3-path and the 6-path language (since they illustrate two slightly different things) BUT I understand that for clarity we need to streamline the metrics. Component size and 6 degrees of separation is good.

One thing I will mention is that perhaps we should briefly define how we're going to analyze the networks at the beginning (i.e. component size & 6-paths - just for some clearer signposting)? I know that might feel like a lot of text right at the beginning, but I think it would help readers to know what to expect when clicking over to the "what's going on" tabs.

martinamorris commented 4 years ago

At the risk of being labeled "always the contrarian," I think we have room for 3 and 6. I still like the idea of shaking up people's intuitions. Think you can see 3 steps out to assess your risk? Here's how many HH are 3 steps away from you. On top of that are the ones you know you can't see that really add up: x HH are 6 degrees of separation away.

We need to use the right language now. We're no longer talking about paths (which are multiply embedded -- 3 paths are also 3-choose-2 2 paths). We're now talking about geodesics between pairs of persons (one per pair). So, at least here, among ourselves, we should ditch the path language, as it's confusing.

FINALLY: I think we should table the results in the "What's Going On" tabs:

Way to think about        In this network    Your risk
The connected cluster     xx HH              yy% chance that your HH is connected
HH 3 deg of sep away      aa HH              bb% of HH can reach you in 3 steps
HH 6 deg of sep away      cc HH              dd% of HH can reach you in 6 steps

There can be text describing the increase/decrease from the previous network, but this anchors the primary findings in a repetitive frame that will make it easier to read/comprehend.

dth2 commented 4 years ago

I suspect that, given the length of the document with all of the explanation, people are either not going to read through it or they are going to be in for a penny, in for a pound. Whether or not we include geodesics of length 3 or less will not be a tipping point for audience penetration. If my suspicions are correct, I think it is worth keeping both three and six because they tell different stories. If you are going to choose just 1, I would say 6 because it is more likely to resonate.

martinamorris commented 4 years ago

One last comment -- the first storyboard (Good Ol Days) still scrolls off the screen. Too many words. 15 ties is duplicated, lots of opportunity to shorten and still make the key points.

sgoodreau commented 4 years ago

Thanks all! OK, glad to know I wasn't off on my own limb with having both the 3 and 6 degs of sep. The version with those is uploaded. And @dth2, you're right -- the mathy bits will either draw you into that path or not, and this isn't the tipping point.

@martinamorris I agree with the table at the end - I had wanted to do that on Friday and reached my limit. Will add tomorrow if I can.

And will also trim Good Ol' Days.

As for adding some of the definition language up front, @EmilyPo - did you mean at the very top, before we split into the "visual" and "math" tracks? If so, I don't think that's the right path to take. As of now, it is set up so that most readers can go through the whole thing without hearing or seeing any numbers or complex terms. And I think we need to keep it that way.

Putting this in context, my high school friend who is now getting her PhD in Health Communications said that even the path that just looks at the figures and skips all the math is written at a college level. And if you want to have something be broadly understood by the public it should be at more like a 5th or 8th grade level. They get special training in how to write like that. So we shouldn't add any jargon into that path at all. Karin actually said she might try to write up an accessible English version for us, although she works for the CT DOH so has a pretty full plate herself right now :-)

Thanks all!!!!