lsms-worldbank / selector

Load SuSo meta data into chars and utilities using them
https://lsms-worldbank.github.io/selector/

`sel_is` command #9

Open kbjarkefur opened 10 months ago

kbjarkefur commented 10 months ago

@arthur-shaw, when I looked at this with fresh eyes, I thought of what I think is a better way to do the sel_is_numeric, sel_is_string, etc. commands. Creating one command for each sel_is_??? would be a pain to maintain across that many files. So how about we make the type a sub-command, like this: sel_is numeric, sel_is string, etc.? Then there is only one command, sel_is, and it accepts the sub-commands below:

I can create this quite quickly, but let me know what you think first.

arthur-shaw commented 10 months ago

@kbjarkefur, I like this idea. All of these commands would have shared code (e.g., capture variables in a macro, return the macro as r(varlist), message about number/list of variables, etc.). All of these commands would benefit from having shared documentation with details on each sub-command. And all of these commands would probably have a common API (e.g., limit scope of selection by variable name glob, etc.).

The only change I might propose is the name. For the main command, consider sel_vars. For the sub-command, consider is_{suso_type}. In pseudo English, this would read: select variables that are {suso_type}.

For the API, here's a link to one part of our past discussions.

For the implementation of these question type selectors, here's a synthesis of what I've done in cleanstart:

My only concern is that this would require a fair number of metadata characteristics. Some might be able to be combined (e.g., linked_to_roster_id and linked_to_question_id_var could be combined into an is_linked indicator). Most need to stay as is.

Here's my quick compilation:

kbjarkefur commented 10 months ago

Agreed. We already have a command called sel_vars, but I think we should change it to accommodate this.

Currently sel_vars has the syntax sel_vars "query_string", type("NumericQuestion"). I would still like to keep the query_string, as it allows users to start using these commands on custom chars they create themselves. But let's move the query to an option and make the sub-commands you suggest the main parameter.

So it would be sel_vars is_numeric , query("query_string") where query is of course optional.

I will ask about the metadata types we have not discussed yet during our call today.

kbjarkefur commented 10 months ago

Add a varlist(varlist) option so that the output of one command in this package can be piped into the next in chained calls.
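To make the proposed interface concrete, here is a minimal sketch of a single command that dispatches on a sub-command and optionally limits the search to a varlist. It is written in Python purely for illustration (the package itself is Stata). The metadata dictionary, the variable names, and the function body are assumptions; only the sub-command names and the type strings come from this thread.

```python
# Hypothetical metadata: variable name -> characteristics.
# The variable names and values are made up for illustration.
META = {
    "age": {"type": "NumericQuestion"},
    "income": {"type": "NumericQuestion"},
    "visit_date": {"type": "DateTimeQuestion", "is_timestamp": 0},
}

# One predicate per sub-command; adding a selector means adding one entry.
SUBCOMMANDS = {
    "is_numeric": lambda m: m["type"] == "NumericQuestion",
    "is_date": lambda m: m["type"] == "DateTimeQuestion"
    and m.get("is_timestamp") == 0,
}


def sel_vars(subcommand, varlist=None, meta=META):
    """Return variables matching `subcommand`, optionally limited to `varlist`."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown sub-command: {subcommand}")
    predicate = SUBCOMMANDS[subcommand]
    # Without a varlist, every variable in the metadata is a candidate.
    candidates = [v for v in meta if varlist is None or v in varlist]
    return [name for name in candidates if predicate(meta[name])]


print(sel_vars("is_numeric"))                      # all numeric questions
print(sel_vars("is_numeric", varlist=["income"]))  # scope narrowed by varlist
```

Because the function takes a list of names and returns one, the result of one call can be passed as the varlist of the next, which is the chaining the varlist option is meant to allow.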

kbjarkefur commented 10 months ago

is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))

I do not find the variable linked_to_question_id in the meta data dta file. (I do find linked_to_roster_id)
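For reference, the condition above can be sketched as a small predicate. This is Python for illustration only, not the package's Stata code. The `mi()` helper is a rough stand-in for Stata's `mi()`, and the example rows are invented. The sketch indexes both linked_to_* fields directly, so a metadata row that lacks one of them raises a KeyError instead of silently counting as "not linked".

```python
def mi(value):
    """Rough stand-in for Stata's mi(): None and empty strings are missing."""
    return value is None or value == ""


def is_single_select(m):
    """Single-select question not linked to a roster ID or a list question."""
    return (
        m["type"] == "SingleQuestion"
        and mi(m["linked_to_roster_id"])
        and mi(m["linked_to_question_id"])
    )


# Invented example rows.
plain = {
    "type": "SingleQuestion",
    "linked_to_roster_id": "",
    "linked_to_question_id": "",
}
linked = {
    "type": "SingleQuestion",
    "linked_to_roster_id": "abc123",
    "linked_to_question_id": "",
}

print(is_single_select(plain))   # True
print(is_single_select(linked))  # False
```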

kbjarkefur commented 10 months ago
  • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1

These seem to have a copy-paste issue, as the conditions are duplicates.

kbjarkefur commented 10 months ago
  • is_date. type == "DateTimeQuestion" & is_timestamp == 0
  • is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0

I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1

EDIT: also, the value for the variable is_timestamp is not 0/1 in the meta data, it is TRUE/FALSE. I would prefer to change this to 0/1 in the meta data as that is Stata practice.

arthur-shaw commented 10 months ago

> is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))
>
> I do not find the variable linked_to_question_id in the meta data dta file. (I do find linked_to_roster_id)

You're right. I updated the data files on OneDrive to include this variable. This omission is due to an error in the susometa package. This exercise is helping me improve that package. Sorry that you're suffering through that process. Thanks for flagging issues.

>   • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
>   • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1
>
> These seem to have a copy-paste issue, as the conditions are duplicates.

Correct. It should have been:

>   • is_date. type == "DateTimeQuestion" & is_timestamp == 0
>   • is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0
>
> I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1
>
> EDIT: also, the value for the variable is_timestamp is not 0/1 in the meta data, it is TRUE/FALSE. I would prefer to change this to 0/1 in the meta data as that is Stata practice.

Your assumption is right. Sorry for the copy-paste problem.

As for the values in the metadata, I've updated the data on OneDrive to have 0/1 values. While I expected TRUE/FALSE to get automatically converted to 1/0 values, I've now added some code to convert them explicitly. For the moment, I'm simply targeting variables matching is_*. If there are others, please let me know.
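A conversion along these lines could look like the sketch below. It is Python for illustration only and is not the actual susometa code. The function name, the mapping, and the example row are assumptions; only the is_* pattern and the TRUE/FALSE to 1/0 recoding come from this thread.

```python
from fnmatch import fnmatchcase

# Assumed string encoding of the flags before conversion.
FLAG_VALUES = {"TRUE": 1, "FALSE": 0}


def recode_flags(row, pattern="is_*"):
    """Recode TRUE/FALSE to 1/0 for keys matching `pattern`; leave the rest."""
    return {
        key: FLAG_VALUES.get(value, value) if fnmatchcase(key, pattern) else value
        for key, value in row.items()
    }


# Invented metadata row for a timestamp question.
row = {"type": "DateTimeQuestion", "is_timestamp": "TRUE"}
recoded = recode_flags(row)

# After recoding, the corrected condition can be written with 0/1 values.
print(recoded["type"] == "DateTimeQuestion" and recoded["is_timestamp"] == 1)
```

Values that are already 0/1 pass through unchanged, so running the conversion twice is harmless. Any flag whose name does not match is_* is left as it is, which is why other variables would need to be listed explicitly.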

kbjarkefur commented 10 months ago

The fixes from your last comment were implemented in 62e50c451150baae8a77926058f9a82a10aa5891

kbjarkefur commented 10 months ago

Adding a reminder here for you to update the helpfile of sel_vars following the updates in #10