HEGSRR / OR-Replicability-in-Geography-Survey


Q6 Definition Coding #5

Closed: josephholler closed this issue 6 months ago

josephholler commented 1 year ago

There are multiple parts to these workflows, so here's a separate issue to track them.

We defined replication as:

A Replication is any study that seeks to evaluate the claim of a prior study using similar procedures and new data. The claim made in a prior study has been replicated when the replication produces outcomes that are consistent with the prior claim and increase confidence in that claim.
The data or procedures used in the replication study may be quantitative, qualitative, or mixed. 

We may code responses for:

Note that a code for reproducibility without codes for results and new data indicates confusion between the terms reproducibility and replicability.

josephholler commented 1 year ago

Note: please wait until I have the chance to go through more of these to stabilize a coding scheme.

josephholler commented 1 year ago

I've run through a full coding with a scheme that we can discuss before moving forward. These are the main codes of interest:

NASEM Replicability

Reproducibility or Reanalysis

Flags of interesting themes:

From these codes, we can derive different types of responses. For example, a respondent understands replicability as reproducibility if they only have codes for open/repeatable, same/similar results, same data, same methods, and same context. Similarity to the NASEM definition can be derived by combining "new data", "same methods", "new context", and "same" or "similar" results.

We may want to combine "same" and "similar" results into one indicator, because it is sometimes difficult to discern the difference between them.
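To make these derivations concrete, here is a rough pandas sketch (not the project's actual code); the column names are hypothetical stand-ins for the real indicator columns in the coding spreadsheet:

```python
import pandas as pd

# Toy 0/1 code matrix; the column names are hypothetical stand-ins
# for the real indicator columns in the coding spreadsheet.
df = pd.DataFrame({
    "data-new":        [1, 0],
    "data-same":       [0, 1],
    "methods-same":    [1, 1],
    "context-new":     [1, 0],
    "context-same":    [0, 1],
    "results-same":    [0, 1],
    "results-similar": [1, 0],
    "open-repeatable": [0, 1],
})

# Combine "same" and "similar" results into one indicator, per the note above.
df["results-combined"] = df[["results-same", "results-similar"]].max(axis=1)

# Similarity to the NASEM definition: new data, same methods, new context,
# and consistent (same or similar) results.
df["nasem-like"] = ((df["data-new"] == 1) & (df["methods-same"] == 1)
                    & (df["context-new"] == 1) & (df["results-combined"] == 1))

# Reads replicability as reproducibility: only reproducibility-flavored
# codes present, and no "new data" / "new context" codes.
repro_codes = ["open-repeatable", "results-same", "results-similar",
               "data-same", "methods-same", "context-same"]
df["repro-understanding"] = (df[repro_codes].any(axis=1)
                             & ~df[["data-new", "context-new"]].any(axis=1))

print(df[["nasem-like", "repro-understanding"]])
```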

josephholler commented 1 year ago

See data/derived/public/q6_coding_jh.csv

josephholler commented 1 year ago

I revised the column headers to well-formed column names and adjusted the list of variables accordingly. I also saved this as q6_coding_jh.xlsx and ordered the columns to parallel the list above: essential variables for calculating similarity to the NASEM definition first, followed by the other interesting flags.

I calculated replicability-nasem = sum(data-new, methods-similar, context-new, results-similar) - sum(data-same, methods-varied, context-same, results-same) and sorted on this variable. It gives a good first pass at placing definitions on a gradient from replicability to reproducibility.
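For concreteness, a minimal pandas sketch of that gradient score, assuming the CSV's indicator columns carry the same names used in the formula:

```python
import pandas as pd

df = pd.read_csv("data/derived/public/q6_coding_jh.csv")

# Codes pointing toward replicability (+) and reproducibility (-).
plus  = ["data-new", "methods-similar", "context-new", "results-similar"]
minus = ["data-same", "methods-varied", "context-same", "results-same"]

df["replicability-nasem"] = df[plus].sum(axis=1) - df[minus].sum(axis=1)

# Sort from most replicability-like to most reproducibility-like.
df = df.sort_values("replicability-nasem", ascending=False)
```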

I avoided inferring codes from ambiguous definitions unless the inference was fairly obvious (e.g., an archeologist commenting that no one else could re-dig the same site obviously implies they are thinking same data and same context).

josephholler commented 1 year ago

@Peter-Kedron: the combination of context-new and results-similar tells us about confirming results in new contexts, whereas we could reserve the validate-external flag for higher-order thinking about validating claims, findings, understanding of phenomena, etc. I'm not sure where that would put the climate reconstructions. Thoughts? It's just a flag for finding interesting definitions, not a variable that similarity to the NASEM definition hinges on.

Let's keep this comment up-to-date with coding definitions: https://github.com/HEGSRR/OR-Replicability-in-Geography-Survey/issues/5#issuecomment-1432059649

Peter-Kedron commented 1 year ago

@josephholler, I did my definition checks last Thursday and forgot to upload them until today. In the pk_code column, 0 indicates agreement with Joe and 1 flags a definition for discussion. Let's go through those at the next full meeting (3/16).

@SarahBardin, I would say don't code these yourself. Just join in our discussion of the 'conflicts' and weigh in there.

josephholler commented 1 year ago

The latest work on this is in data/derived/public/q6_coding_jh.xlsx

Peter-Kedron commented 1 year ago

@josephholler, when you come back online: I have rechecked the Q6 coding and saved the most recent version as data/derived/public/q6_coding_jh_pk.xlsx. The last column of that file now has my explicit suggestion, and my reasoning for it, for each of the remaining conflicting definitions.

There are only 31 conflicts remaining. You can filter to them using pk-flag = 1.
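If it helps, a one-off pandas sketch of that filter (assuming the flag column in the file is literally named pk-flag; reading .xlsx requires openpyxl):

```python
import pandas as pd

# Load the rechecked coding file and keep only the flagged conflicts.
df = pd.read_excel("data/derived/public/q6_coding_jh_pk.xlsx")
conflicts = df[df["pk-flag"] == 1]  # should be the 31 remaining conflicts
print(len(conflicts))
```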

When you are back, please go through that file and Approve/Disapprove the recommendations. If after that review there are still a few you want to discuss, let me know and we can finish this task.

josephholler commented 1 year ago

@Peter-Kedron I looked over the 31 remaining conflicts and changed codes according to my notes in the jh-response column. The majority of the notes say "ok" and accept your suggestion, while a minority justify keeping the code as-is. If it looks good, I think we need to either remove the other codes from the excluded cases or make sure we exclude those cases when we analyze.

Peter-Kedron commented 1 year ago

@josephholler, Thanks Joe. I went through and resolved all of the definitions. They should be good to go now. I'll proceed with the initial analysis and write-up of this section in Overleaf.

Peter-Kedron commented 1 year ago

@josephholler, I am setting up the results summary for Q6. What do you think about the following?

I want to code the match to the NASEM definition along three dimensions as follows:

NASEM Definition: Replicability to mean obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data

  1. data-new = 1: this is clear in the definition.
  2. methods-same = 1: this is not explicit in the definition, but reading the report, it is the most consistently observed interpretation. Also, as you point out in the coding, "similar" tips into reanalysis and the distinction between direct and conceptual replication.
  3. results-combined = 1: picking up on your note at the end of the definitions post in this thread, same vs. similar is too hard to parse here, and I don't think the difference is important. Either could be read as in line with "consistent", and I don't see the value in trying to hash out a line between them from the thin info in the definitions.

My plan is to sum across those three. I am omitting context from that sum because the definition is not clear on it. Unlike reproducibility, where same data necessitated same context, we don't have that implicit cue here. So I want to leave context out of the summary measure, but I will discuss it and provide percentage stats to contextualize the collective thinking.

josephholler commented 1 year ago

So you propose that: nasem-replication = data-new + methods-same + (results-same OR results-similar)
That's good; your reasoning is well justified. There are two versions of this data, a working Excel sheet and a CSV saved from it, and there is an explanation of the measure in the metadata markdown file.
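For reference, a minimal sketch of computing that summary measure in pandas; the indicator column names are assumed to match the formula above, and the authoritative definitions live in the metadata markdown file:

```python
import pandas as pd

df = pd.read_csv("data/derived/public/q6_coding_jh.csv")

# results-combined = results-same OR results-similar, per the discussion above.
results_combined = df[["results-same", "results-similar"]].max(axis=1)

# nasem-replication ranges 0 to 3: new data, same methods, consistent results.
df["nasem-replication"] = df["data-new"] + df["methods-same"] + results_combined

# Distribution of similarity to the NASEM definition.
print(df["nasem-replication"].value_counts().sort_index())
```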