Open kbjarkefur opened 10 months ago
@kbjarkefur, I like this idea. All of these commands would have shared code (e.g., capture variables in a macro, return the macro as r(varlist), message about number/list of variables, etc.). All of these commands would benefit from having shared documentation with details on each sub-command. And all of these commands would probably have a common API (e.g., limit scope of selection by variable name glob, etc.).
The only change I might propose is the name. For the main command, consider sel_vars. For the sub-command, consider is_{suso_type}. In pseudo English, this would read: select variables that are {suso_type}.
For the API, here's a link to one part of our past discussions.
For the implementation of these question type selectors, here's a synthesis of what I've done in cleanstart:
is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))
is_numeric. Numeric question: type == "NumericQuestion"
has_decimals. Is not an integer: is_integer == 0
is_text. Is a text question: type == "TextQuestion" & mi(mask)
follows_pattern. Text question that follows a pattern: type == "TextQuestion" & !mi(mask)
is_list. List question: type == "TextListQuestion"
is_multi_select. Is a multi-select question: type == "MultyOptionsQuestion"
is_multi_ordered. Multi-select with question order recorded: type == "MultyOptionsQuestion" & are_answers_ordered == 1. (NOTE: in the data shared, rename are_answered_ordered to are_answers_ordered. Will be fixing this upstream momentarily.)
is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1
is_date. type == "DateTimeQuestion" & is_timestamp == 0
is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0
is_gps. GPS question: type == "GpsCoordinateQuestion"
is_variable. Computed SuSo variable: type == "Variable"
is_picture. Picture capture question: type == "MultimediaQuestion"
is_barcode. Need to look this up. type == "QRBarcodeQuestion"
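As a minimal sketch of how one of these conditions could be turned into a list of variable names: this assumes the metadata .dta is in memory with one row per question, and that the column holding the variable name is called varname. That column name is hypothetical and not confirmed anywhere in this thread.

```stata
* Sketch only: "varname" is an assumed column name in the metadata file.
* Collect the names of all single-select questions that are not linked
* to a roster ID or to a list question.
levelsof varname if type == "SingleQuestion" & ///
    mi(linked_to_roster_id) & mi(linked_to_question_id), ///
    local(single_select) clean

* The matching names are now in the local macro single_select.
display "`single_select'"
```

The other selectors would follow the same pattern with a different if condition.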
My only concern is that this would require a fair number of metadata characteristics. Some might be able to be combined (e.g., linked_to_roster_id and linked_to_question_id could be combined into an is_linked indicator). Most need to stay as is.
Here's my quick compilation:
type
linked_to_roster_id, linked_to_question_id
is_integer
mask
are_answers_ordered, yes_no_view
Agreed. We have a command called sel_vars already, but I think we should change that to accommodate this.
Currently sel_vars has the syntax sel_vars "query_string", type("NumericQuestion"). I would still like to keep the query_string, as it allows users to start using these commands on custom chars they create themselves. But let's move the query to an option and the sub-commands you suggest to the main parameter.
So it would be sel_vars is_numeric, query("query_string"), where query is of course optional.
I will ask about the metadata types we have not discussed yet during our call today.
Add an option varlist(varlist) to make it possible to pipe this from chained calls of commands in this package.
is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))
I do not find the variable linked_to_question_id in the metadata dta file. (I do find linked_to_roster_id.)
is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1
These seem to have a copy-paste issue, as the conditions are duplicates.
is_date. type == "DateTimeQuestion" & is_timestamp == 0
is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0
I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1
EDIT: also, the value for the variable is_timestamp is not 0/1 in the metadata, it is TRUE/FALSE. I would prefer to change this to 0/1 in the metadata, as that is Stata practice.
is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))
I do not find the variable linked_to_question_id in the metadata dta file. (I do find linked_to_roster_id.)
You're right. I updated the data files on OneDrive to include this variable. This omission is due to an error in the susometa package. This exercise is helping me improve that package. Sorry that you're suffering through that process. Thanks for flagging issues.
is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1
These seem to have a copy-paste issue, as the conditions are duplicates.
Correct. It should have been:
is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 0
is_date. type == "DateTimeQuestion" & is_timestamp == 0
is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0
I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1
EDIT: also, the value for the variable is_timestamp is not 0/1 in the metadata, it is TRUE/FALSE. I would prefer to change this to 0/1 in the metadata, as that is Stata practice.
Your assumption is right. Sorry for the copy-paste problem.
As for the values in the metadata, I've updated the data on OneDrive to have 0/1 values. While I expected TRUE/FALSE to get automatically converted to 1/0 values, I've now added some code to convert them explicitly. For the moment, I'm simply targeting variables matching is_*. If there are others, please let me know.
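For anyone still working from an older copy of the metadata, a Stata-side sketch of a comparable conversion might look like the following. It assumes the is_* variables were stored as the strings "TRUE"/"FALSE", which this thread does not confirm, and it is not the code referred to above.

```stata
* Sketch only: assumes is_* variables are strings holding "TRUE"/"FALSE".
foreach v of varlist is_* {
    * Only touch variables that are actually stored as strings
    capture confirm string variable `v'
    if _rc == 0 {
        * 1 for "TRUE", 0 for any other non-empty value, missing if empty
        generate byte `v'_01 = (`v' == "TRUE") if !missing(`v')
        order `v'_01, after(`v')
        drop `v'
        rename `v'_01 `v'
    }
}
```

Variables that are already numeric are left untouched, so the loop can be re-run safely.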
The fixes in your last comment were implemented in 62e50c451150baae8a77926058f9a82a10aa5891
Adding a reminder here for you to update the helpfile of sel_vars following the updates in #10
@arthur-shaw, when I looked at this with fresh eyes I thought of a way that I think will be better to do the sel_is_numeric, sel_is_string commands. Creating one command for each sel_is_??? will be a pain to maintain across that many files. So how about we make the type a sub-command, like this: sel_is numeric, sel_is string, etc. Then there is only one command, sel_is, and it accepts the sub-commands below:
sel_is numeric. All numeric questions.
sel_is string. All questions that contain text (i.e., Text and List types).
sel_is text. Text questions only.
sel_is list. List questions only.
sel_is multi_select. All multi-select questions.
sel_is multi_ordered. Multi-select questions where answer order is captured.
sel_is multi_yn. Multi-select with yes/no answers.
sel_is multi_checkbox. Multi-select with checkbox input.
sel_is gps.
I can create this quite quickly. But let me know what you think first.
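To make the single-command idea concrete, here is a rough sketch of how the sub-command could be read and dispatched. It is not the package's implementation: the program name sel_is_sketch is made up, and it assumes each variable carries a characteristic named type holding its SuSo question type, which is a guess about how the metadata is attached. Only type-based sub-commands are covered. The ones that need further metadata (multi_ordered, multi_yn, multi_checkbox) are left out, as is the mask check that distinguishes plain text questions from patterned ones.

```stata
* Sketch only. Assumes a characteristic named "type" (hypothetical) on each
* variable, holding its SuSo question type.
capture program drop sel_is_sketch
program define sel_is_sketch, rclass
    version 14

    * The first token is the sub-command, e.g. "numeric" or "string"
    gettoken subcmd 0 : 0, parse(" ,")

    * Map each sub-command to the question types it should match
    if      "`subcmd'" == "numeric"      local types "NumericQuestion"
    else if "`subcmd'" == "string"       local types "TextQuestion TextListQuestion"
    else if "`subcmd'" == "text"         local types "TextQuestion"
    else if "`subcmd'" == "list"         local types "TextListQuestion"
    else if "`subcmd'" == "multi_select" local types "MultyOptionsQuestion"
    else if "`subcmd'" == "gps"          local types "GpsCoordinateQuestion"
    else {
        display as error "sel_is_sketch: unknown sub-command `subcmd'"
        exit 198
    }

    * Keep every variable whose "type" characteristic is in the list
    local selected
    foreach v of varlist _all {
        local t : char `v'[type]
        if "`t'" != "" {
            if `: list t in types' local selected `selected' `v'
        }
    }

    display as text "`: word count `selected'' variable(s) selected"
    return local varlist "`selected'"
end

* Example call on a toy dataset with hand-set characteristics
sysuse auto, clear
char price[type] "NumericQuestion"
char make[type]  "TextQuestion"
sel_is_sketch numeric
display "`r(varlist)'"
```

Keeping the mapping in one place means a new question type needs one extra line rather than a new ado-file, which is the maintenance benefit described above.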