isudatateam / datateam

ISU Data Team Effort
MIT License

DN_2 ready for review #436

Closed skbowd closed 2 years ago

skbowd commented 2 years ago

Within DN_2 dataset (INRS_2019_PartnerReportsHumanOutreach_Processed_C20191001)

1) Assuming only "Y" for within priority watersheds will be exported?

2) I noticed I made comments in the comments column instead of creating a "data team notes" column. Should I create this column and move my notes over?

3) Columns M, N, O, P need to be standardized. Sometimes they are marked with an "x" indicating that the topic was covered, and other times there is text going into greater detail about the category. Only column M is marked to be part of the final analysis. Thoughts?

4) Some of the headings have red text, but I am not sure why.  Possibly because I added and/or edited the ones in black?

Data dictionary

lnowatzke commented 2 years ago

> Assuming only "Y" for within priority watersheds will be exported?

Yes, I think we should only export those events that are "Y" for the within-watershed variable in DN_02.

> I noticed I made comments in the comments column instead of creating a "data team notes" column. Should I create this column and move my notes over?

Yes, please create a data team notes column and move your comments over. That way we can leave in the "comments" information but not publish the data team notes.

> Columns M, N, O, P need to be standardized. Sometimes they are marked with an "x" indicating that the topic was covered, and other times there is text going into greater detail about the category. Only column M is marked to be part of the final analysis. Thoughts?

I think we should remove M, N, O, and P when we publish. As you noticed, these are very unstandardized, and I'm not sure they add much useful information. I changed column M to "No" in the data dictionary for inclusion in the final database publication.

> Some of the headings have red text, but I am not sure why. Possibly because I added and/or edited the ones in black?

They look okay to me and they match the data dictionary, so go ahead and turn them back to black.

> One original variable name is in red text (I don't know why), but it is marked N for not part of final analysis, so maybe irrelevant?

I changed it back to black. It looks fine to me and we're not including it anyway.
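For reference, the export rules agreed on in this reply could be sketched roughly as below. This is only an illustration, not the team's actual export script: the column names (`within_priority_watershed`, `topic_m` through `topic_p`, `team_notes`) are hypothetical stand-ins for the real DN_2 headers, and rows are represented as plain dictionaries.

```python
# Hypothetical column names; the real DN_2 headers may differ.
WATERSHED_COL = "within_priority_watershed"
DROP_COLS = {"topic_m", "topic_n", "topic_o", "topic_p", "team_notes"}


def prepare_export(rows):
    """Keep only rows marked "Y" and drop columns not meant for publication."""
    exported = []
    for row in rows:
        # Skip events that are not within a priority watershed.
        if (row.get(WATERSHED_COL) or "").strip().upper() != "Y":
            continue
        # Copy the row without the unstandardized topic columns and internal notes.
        exported.append({k: v for k, v in row.items() if k not in DROP_COLS})
    return exported


# Made-up sample rows, for illustration only.
rows = [
    {"event": "field day", "within_priority_watershed": "Y",
     "topic_m": "x", "comments": "from partner", "team_notes": "possible duplicate"},
    {"event": "workshop", "within_priority_watershed": "N",
     "topic_m": "", "comments": "", "team_notes": ""},
]
print(prepare_export(rows))
```

The original "comments" column is kept in the output, while the data team notes are dropped, matching the intent of publishing source comments but not internal notes.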

skbowd commented 2 years ago

Done. The only thing I'm not sure about is which column (comments or team_notes) should hold my comments about the duplicate entries that we discussed the other day. They are not original to the data source, but should they be included in the export?

loriabendroth commented 2 years ago

I filtered to only Y and noticed that these two columns are often blank. When I look at the DD, it appears that they should always be filled in for activity code and sometimes for dim1 (correct?). @skbowd, can you review these?

activity_code | dim1_method_scale
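A check like the one described above (filter to Y, then look for blanks in the classification columns) might be sketched as follows. `activity_code` and `dim1_method_scale` come from the thread; the `within_priority_watershed` column name and the sample values are assumptions made for illustration.

```python
def find_blank_classifications(rows, required=("activity_code",)):
    """Return the indices of "Y" rows where any required column is blank."""
    missing = []
    for i, row in enumerate(rows):
        # Only rows within priority watersheds are candidates for export.
        if (row.get("within_priority_watershed") or "").strip().upper() != "Y":
            continue
        # Treat None, empty strings, and whitespace-only values as blank.
        if any(not (row.get(col) or "").strip() for col in required):
            missing.append(i)
    return missing


# Made-up sample rows; "A1" is a placeholder, not a real activity code.
rows = [
    {"within_priority_watershed": "Y", "activity_code": "A1", "dim1_method_scale": ""},
    {"within_priority_watershed": "Y", "activity_code": "", "dim1_method_scale": ""},
    {"within_priority_watershed": "N", "activity_code": "", "dim1_method_scale": ""},
]
print(find_blank_classifications(rows))
```

By default only `activity_code` is required, since the DD suggests `dim1_method_scale` is filled in only sometimes; pass `required=("activity_code", "dim1_method_scale")` to flag both.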

@loriabendroth @lnowatzke Oh yes, Laurie and I actually had this discussion the other day. The entries that are not classified are from 2019, which we will not be using. Should we add a column to this dataset for whether or not to export?

lnowatzke commented 2 years ago

> @loriabendroth @lnowatzke Oh yes, Laurie and I actually had this discussion the other day. The entries that are not classified are from 2019, which we will not be using. Should we add a column to this dataset for whether or not to export?

@skbowd Yes, please use a new variable to indicate whether or not to export. Like you said, we're not exporting 2019 but we are exporting all earlier rows.
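One possible way to derive such an export variable is sketched below. The column names `year` and `export` are hypothetical, and the rule (2019 rows are not exported, earlier rows are) is taken from this thread; it is not a description of how the column was actually built.

```python
def add_export_flag(rows, year_col="year", excluded_years=(2019,)):
    """Return copies of the rows with an "export" column set to "Y" or "N"."""
    flagged = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are left unchanged
        # Rows from an excluded year are marked "N"; all others "Y".
        new_row["export"] = "N" if int(row[year_col]) in excluded_years else "Y"
        flagged.append(new_row)
    return flagged


# Made-up sample rows, for illustration only.
rows = [
    {"event": "field day", "year": "2018"},
    {"event": "workshop", "year": "2019"},
]
for row in add_export_flag(rows):
    print(row["event"], row["export"])
```

Keeping the flag as an explicit column, rather than deleting the 2019 rows, leaves the unclassified entries in the working dataset while still excluding them at export time.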

skbowd commented 2 years ago

@lnowatzke Will this new column be part of the final export? I have TBD under "Part of Final Analysis and Models" in the data dictionary for now.

skbowd commented 2 years ago

@lnowatzke