@jcwhitmer, I'm looking at the last pull and @cb-air seems to have entirely deleted the table from the readme.
[edited by PB, wrong link]
@jcwhitmer, I reviewed the files and the order and they agreed. Could your version be behind? Or maybe the book isn't picking up the edits?
Looking at the first, it seems @cb-air's pull was closed by Charles. The text seems to say it was merged, but the symbol indicates it was not. I have to admit I don't have a lot of experience with rejected merges on GitHub, so I don't know what they look like.
@mathgie it seems @cb-air's actually-merged PR removed the table in book.md, but nevertheless, there it is, not fixed.
I'm not sure what's going on, but maybe GitHub is buggy, or it has something to do with the locked repo?
In any case, here's the book.md in my only active branch (updated earlier today):
To add info, in case it's helpful: when I did the most recent PR today (#9), I closed the PR from yesterday (#8) to prevent any confusion. But both of these should have had the variable order correct and the "item information" (double scored) table removed.
It looks like the book.html currently in the main branch reflects the book.md from my most recent PR.
What's odd is I started to make a pull request from the branch Charles linked, and it showed nothing to merge. So the file is different, but the same. GitHub is confusing me today.
Are you using the "/cb-duplicate_PR_same_files" branch?
I assume that we're working from the main branch unless told otherwise, and that any substantive changes would be merged there upon completion.
The fact that @cb-air's PR shows no changes means they're synced.
GitHub doesn't work quite how I first imagined. It doesn't work by editing files; it works by applying commits. Because of that, a merge doesn't always do what you expect. As I recall, it doesn't always do what is in the change log. I think I wrote to GH and they explained to me that this was somehow a feature.
[edit by PB to make first two sentences into a thought.]
Data for the competition has been aggregated into a single file from multiple test items. For this challenge you will be using items from the grade 4 and grade 8 NAEP Math Assessments that were administered in 2017 and 2019. Information about the aggregated file and how it was prepared, along with general instructions for the challenge and data handling rules, is contained below. Questions about the challenge should be posted to the GitHub "issues" page for the challenge: https://github.com/naep-as-challenge
Some variables about the item, responses, and respondent were available for all items in the source data. Those variables are described in the table below.
Variable | Description | Type | Values (if constrained) |
---|---|---|---|
student_id | pseudonymous student ID -- not linkable across item-years | string | e.g. "xYzq4StVaC" |
accession | Item number | string | e.g. "VH139087" |
score_to_predict | Outcome to predict | integer | e.g. 1, 2, 3 |
predict_from | Text related to "score_to_predict" | string | "Because A>B" |
year | Year assessment was administered | integer | 2017 or 2019 |
srace10 | Student's race reported by the school | string | (1='White, not Hispanic', 2='Afric Amer, not Hisp', 3='Hispanic of any race', 4='Asian, not Hispanic', 5='Amer Ind/Alaska Nat', 6='Native Ha/Pac Island', 7='>1 race, not Hispanic') |
dsex | Student's sex | integer | 1=male, 2=female |
accom2 | Student accommodations. Note: Item VH304954 did not have accom2 so for this item accom2 is entirely NA. | integer | 1='Accommodated', 2='Not accommodated' |
iep | IEP | integer | 1=SD, 2=Not SD |
lep | English learner status | integer | 1=English Learner, 2=Not English Learner |
rater_1 | Score given by human rater (component-scored items only) | string | e.g. 1A, 2B, 3A … |
pta_rtr1 | Part A human rater score (composite items only) | string | e.g. 1, 2A, 2, 3A … |
ptb_rtr1 | Part B human rater score (composite items only) | string | e.g. 1, 2A, 2, 3A … |
ptc_rtr1 | Part C human rater score (composite items only) | string | e.g. 1, 2A, 2, 3A … |
composite | Composite score (atomic-scored items only) | integer | e.g. 1, 2, 3 |
score | Score (containing partial credit codes) | string | e.g. 1A, 2B, 3A … |
assigned_score | Simplified numeric score total for item (1, 2, 3...) from either "rater_1" or "composite" | integer | 1, 2, 3 … |
ee_use | Item used equation editor | integer | 0=no EE use, 1=EE use |
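Since the demographic fields arrive as integer or string codes, it can help to attach the value labels from the table above before any analysis. A minimal R sketch; the file name naep_responses.csv is our assumption, not part of the challenge materials:

```r
# Hypothetical file name; substitute the aggregated file you were provided.
responses <- read.csv("naep_responses.csv", stringsAsFactors = FALSE)

# Label the coded demographic variables using the value lists above.
responses$dsex <- factor(responses$dsex, levels = 1:2,
                         labels = c("male", "female"))
responses$srace10 <- factor(responses$srace10, levels = 1:7,
                            labels = c("White, not Hispanic",
                                       "Afric Amer, not Hisp",
                                       "Hispanic of any race",
                                       "Asian, not Hispanic",
                                       "Amer Ind/Alaska Nat",
                                       "Native Ha/Pac Island",
                                       ">1 race, not Hispanic"))
```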
There are four "Type II" items which were composed of multiple sub-items or parts that each have their own set of scores and response fields. For the purpose of the challenge, participants are asked to predict the combined overall score ("score_to_predict"), based on the constructed response component which we believe is the most salient ("predict_from"), using NLP. For the six other items, called "Type I" items here, there are multiple parts within an item; however, these parts are considered dependently linked portions of the item and, as such, were assigned a single score that encompasses the responses contained within both parts.
For the "Type II" items, the sub-item scores have been combined
into a single "assigned_score" variable which is described in the common
variables table above. The original part scores are also included and
can be decoded using the item scoring guides provided in Item information.zip
which will be provided to participants with the responses upon approval of the
data application.
Note that this composite variable is not always the outcome which contestants should predict. To make it clear which outcome contestants should predict, we've created a variable "score_to_predict", which is the field that will be used as the outcome variable when creating predicted scores. We've also created a variable named "predict_from" to identify the most relevant constructed response text to use when creating predicted scores.
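In practice this means a modeling frame needs only those two columns. A minimal sketch, reusing the assumed responses data frame from the earlier example:

```r
# Pair the text identified by "predict_from" with the outcome in
# "score_to_predict"; all other columns are covariates or bookkeeping.
model_df <- data.frame(
  text  = responses$predict_from,
  label = responses$score_to_predict
)
```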
The original item data contained extended constructed response and short constructed response (ECR and SCR) text, item selections for multiple choice, and some process data (such as response "eliminations" for CR items) embedded within a JSON data structure, with MathML (XML) equation editor codes nested inside. The original test item data had different XML structures for each item, and within an item there are differences in the XML coding between years of administration. These differences may impact how predictive models will perform across years.
These data have been parsed to make them easier to process. The parsed XML data, in contrast to the common variables listed above, are different for each item. The item-specific variables are described below the item name in the list that follows. Please note, the format of the data values for the process data (e.g. eliminations) may differ by year for the same item. For example, eliminations may be recorded as "(1, 2, 5)" in 2017 and "1, 2, 5" in 2019.
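Any year-aware reader therefore has to tolerate both eliminations encodings. A hedged R sketch; the helper name is ours, not from the challenge tools:

```r
# Normalize the two year-specific eliminations encodings to integer vectors:
# "(1, 2, 5)" (2017 style) and "1, 2, 5" (2019 style) both become c(1L, 2L, 5L).
normalize_eliminations <- function(x) {
  x <- gsub("[()]", "", x)                   # drop 2017-style parentheses
  as.integer(trimws(strsplit(x, ",")[[1]]))  # split on commas, coerce to int
}

normalize_eliminations("(1, 2, 5)")  # 1 2 5
normalize_eliminations("1, 2, 5")    # 1 2 5
```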
Also note, the CR text has been parsed but not completely cleaned. The data were analyzed for sensitive information (e.g. personally identifiable information, profanity, toxic language) and some responses were removed as a result. However, spellcheck has not been applied to correct what may be obvious spelling errors.
Please consult the scoring guides included in Item information.zip to map the fields below to the question areas.
- parsed_xml_v1 -- Text for ECR item response.

- parsed_xml_v1 -- SCR text
- parsed_xml_v2 -- ECR text

- source1 -- drag and drop tile "from"
- source2 -- drag and drop tile "from"
- source3 -- drag and drop tile "from"
- source4 -- drag and drop tile "from"
- target1 -- drag and drop tile "to"
- target2 -- drag and drop tile "to"
- target3 -- drag and drop tile "to"
- target4 -- drag and drop tile "to"
- parsed_xml_v1 -- CR text
- parsed_xml_v1 -- ECR text
- selected -- MC radio button choices as a logical vector (e.g. "FALSE FALSE TRUE FALSE") for 2019 only.
- eliminations -- MC item eliminations as a variable length numeric vector (e.g., c(1,3,4)) for 2017 only.
- eliminated -- MC item eliminations as a length 4 logical vector (e.g., TRUE FALSE FALSE TRUE) for 2019 only.
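The 2019-style fields above store each logical vector as a single space-separated string, so they need unpacking. A minimal R sketch; the helper name is ours:

```r
# Parse 2019-style "selected"/"eliminated" strings into R logical vectors.
parse_logical <- function(x) as.logical(strsplit(x, "\\s+")[[1]])

parse_logical("FALSE FALSE TRUE FALSE")
#> [1] FALSE FALSE  TRUE FALSE
```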
- selected1 -- 1st MC item option radio button 1
- selected2 -- 1st MC item option radio button 2
- selected3 -- 1st MC item option radio button 3
- selected4 -- 1st MC item option radio button 4
- selected1.1 -- 2nd MC item option radio button 1
- selected2.1 -- 2nd MC item option radio button 2
- eliminated1 -- 1st MC item elimination option radio button 1
- eliminated2 -- 1st MC item elimination option radio button 2
- eliminated3 -- 1st MC item elimination option radio button 3
- eliminated4 -- 1st MC item elimination option radio button 4
- eliminated1.1 -- 2nd MC item elimination option radio button 1
- eliminated2.1 -- 2nd MC item elimination option radio button 2
- parsed_xml_v1 -- ECR text
- partA_response_val -- 1st MC item drop down menu selections as numeric vector (e.g. c("1","1")) in 2017, and a fixed length logical vector in 2019.
- partB_response_val -- 2nd MC item radio button selections as vector (e.g. c("1","")) in 2017, and a fixed length logical vector in 2019.
- partB_eliminations -- MC item eliminations for part B; format differs by year.
- parsed_xml_v1 -- ECR text

Note -- For both the response values and the eliminations, the format of the data changes between 2017 and 2019. In 2017, eliminations are stored as a list of numbers, perhaps in chronological order (e.g., "1", "2", but also "2--1" and "1--2"). In 2019, the responses and eliminations are stored as fixed length logical vectors (e.g., "TRUE TRUE").
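To compare responses across years, the 2017-style number lists can be recast into the 2019-style fixed-length logical form. A hedged sketch; the option count would come from the item's scoring guide, and entries like "2--1" would first need to be split on "--":

```r
# Map 2017-style option-number selections onto a fixed-length logical vector
# resembling the 2019 encoding.
to_logical_style <- function(selected, n_options) {
  seq_len(n_options) %in% as.integer(selected)
}

to_logical_style(c("1", "2"), 4)  # TRUE TRUE FALSE FALSE
```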
- parsed_xml_v1 -- ECR text
- parsed_xml_v2 -- CR text
- parsed_xml_v3 -- CR text

- parsed_xml_v1 -- CR text
- parsed_xml_v2 -- CR text

- source1 -- drag and drop tile "from"
- source2 -- drag and drop tile "from"
- source3 -- drag and drop tile "from"
- target1 -- drag and drop tile "to"
- target2 -- drag and drop tile "to"
- target3 -- drag and drop tile "to"
- parsed_xml_v1 -- CR text

- source1 -- drag and drop tile "from"
- source2 -- drag and drop tile "from"
- source3 -- drag and drop tile "from"
- source4 -- drag and drop tile "from"
- target1 -- drag and drop tile "to"
- target2 -- drag and drop tile "to"
- target3 -- drag and drop tile "to"
- target4 -- drag and drop tile "to"
- parsed_xml_v1 -- CR text
The following plots provide information about the distribution of word counts for the "predict_from" constructed response field.
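A count like the one plotted can be reproduced directly; this sketch assumes the responses data frame from the earlier examples:

```r
# Word counts for the "predict_from" text field, split on whitespace.
wc <- lengths(strsplit(responses$predict_from, "\\s+"))
summary(wc)
hist(wc, breaks = 50, main = "predict_from word counts", xlab = "words")
```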
Approximately 5% of the NAEP item responses were double scored. Quadratic Weighted Kappa (QWK) was calculated to estimate the inter-rater reliability for the double-scored responses. The inter-rater reliability estimates for all items are presented below.
Table: Inter-Rater Reliability (QWK) by Item
item | QWK | score type |
---|---|---|
VH134067 | 0.966 | Type I |
VH139380 | 0.981 | Type I |
VH266015 | 0.963 | Type II |
VH266510 | 0.933 | Type I |
VH269384 | 0.970 | Type II |
VH271613 | 0.975 | Type II |
VH302907 | 0.980 | Type I |
VH304954 | 0.984 | Type I |
VH507804 | 0.991 | Type II |
VH525628 | 0.956 | Type I |
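For contestants who want to benchmark their agreement against these values, QWK can be computed from two score vectors in a few lines of R. A hedged sketch following the standard QWK definition (not necessarily the exact implementation used to produce the table):

```r
# Quadratic Weighted Kappa for two equal-length integer score vectors.
qwk <- function(r1, r2) {
  cats <- sort(unique(c(r1, r2)))
  k <- length(cats)
  O <- table(factor(r1, levels = cats),
             factor(r2, levels = cats)) / length(r1)  # observed proportions
  E <- outer(rowSums(O), colSums(O))                  # expected by chance
  W <- outer(seq_len(k), seq_len(k),
             function(i, j) (i - j)^2 / (k - 1)^2)    # quadratic weights
  1 - sum(W * O) / sum(W * E)
}

qwk(c(1, 2, 3, 3), c(1, 2, 3, 2))  # 0.8 for this toy pair
```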
To minimize the risk of statistical disclosure, suppression was applied to demographic variables. To minimize the impact of suppression, an algorithm was developed which prioritized which of the suppression variables were set to missing (NA). The suppression variables, listed in the order in which they were prioritized, were the following: "dsex", "iep", "accom2", "lep", and "srace10". The variable "year" was not included in the suppression.
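The exact disclosure test is not described here, so the following R sketch only illustrates the priority ordering; at_risk() is a hypothetical placeholder, not the actual NAEP disclosure check:

```r
# Priority order from the text above; the first listed is suppressed first.
priority <- c("dsex", "iep", "accom2", "lep", "srace10")

suppress_row <- function(row, at_risk) {
  for (v in priority) {
    if (!at_risk(row)) break  # stop once the row is no longer disclosive
    row[[v]] <- NA            # blank the highest-priority remaining variable
  }
  row
}
```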
The table that follows shows the N counts for the test and training data
sets.
Table: N Counts for Test/Train Split
item | QWK | min score | max score | test N | train N | score type |
---|---|---|---|---|---|---|
VH134067 | 0.966 | 1 | 2 | 4,483 | 40,343 | Type I |
VH139380 | 0.981 | 1 | 3 | 2,018 | 18,157 | Type I |
VH266015 | 0.963 | 1 | 4 | 1,776 | 15,987 | Type II |
VH266510 | 0.933 | 1 | 3 | 4,296 | 38,667 | Type I |
VH269384 | 0.970 | 1 | 4 | 1,758 | 15,826 | Type II |
VH271613 | 0.975 | 1 | 4 | 3,096 | 27,858 | Type II |
VH302907 | 0.980 | 1 | 2 | 4,241 | 38,173 | Type I |
VH304954 | 0.984 | 1 | 3 | 2,743 | 24,686 | Type I |
VH507804 | 0.991 | 1 | 4 | 1,827 | 16,443 | Type II |
VH525628 | 0.956 | 1 | 3 | 1,808 | 16,275 | Type I |
@jcwhitmer I've been using GitHub for a while now. I've never seen a repo in this state.
I wonder if @mathgie needs to rebuild the book.
Bizarre; @mathgie could you see if you can resolve this? I'd like to keep this repo, but we could always nuke it and start afresh if needed. Would be worth a post-mortem discussion once we are past the launch.
So I've now made a separate branch and PR that includes my new resources folder and a fixed version of book.md. I need to go through my copy of book.md one more time once my branch is approved, to make sure that any stray conflicting parts of the doc are cleaned up and any lingering formatting/links will work, but I resolved most of the conflicts.
thank you @mathgie! I think this type of situation is the genesis of git's name.
@mathgie, the table is still there. I'm a bit confused, BTW: did @jcwhitmer want the table to simply drop the double-scored info, or to be removed entirely?
I see the table
I think we want that gone, because we later have this
@mathgie @pdbailey0 remove that table entirely; as you note, it's redundant.
These issues are completed and I've integrated them into my review; we will use PRs from now on for changes.