ReproNim / reproschema-py

Apache License 2.0
2 stars 8 forks source link

[wip] updating reproschema commands to the new pydantic model #36

Closed djarecka closed 3 months ago

djarecka commented 5 months ago

@satra - for now I updated only validate, but can you please take a look if this is what you had in mind?

also, I wasn't sure where the context should be, for now I added to models

djarecka commented 3 months ago

@djarecka yes, isVis should be False for compute items. for compute there is no isVis

but the newly generated compute items are not correct. there are many repetitive items, you can see a lot of pex_bm_apa1_flag01 here. This is from pex_bm_apa_schema (I can't attach the file here but sent to your slack. this is generated from the HBCD csv I shared with you on slack last time)

can you check now. I think the loops in the codes was wrong and it went multiple times through rows. Now the codes runs much faster...

yibeichan commented 3 months ago

yessss! it's way faster and all repetitive results have gone!! thank you!!

yibeichan commented 3 months ago

@djarecka what are the remaining issues we need to fix before merging this PR?

djarecka commented 3 months ago

the demo protocol has a file upload item. perhaps check that out.

yes, I found documentUpload, perhaps reproschema-ui should be updated to give this option (@yibeichan )

what about sql that I see in some rows in hbcd, there are usually some "labels", e.g. "illness label". In addition that I don't know how to represent in reproschema inputType (I believe we don't have sql), I'm guessing that this should be hidden?? I can even see in Field Annotation of the csv "@HIDDEN", but in our code "Field Annotation" is translated to "description", so I'm a bit lost...

I the "Field Annotation" I can see multiple mysterious values, that I'm pretty sue are not human-readable descriptions, but have no ide what to do with them, e.g. "@HIDDEN @NOW-SERVER" or "@NONEOFTHEABOVE=5"

yibeichan commented 3 months ago

yes, I found documentUpload, perhaps reproschema-ui should be updated to give this option (@yibeichan )

yes, it's in the UI: https://github.com/ReproNim/reproschema-ui/blob/05096f07b3c27e82ee9382849b073c0896448a47/src/components/Inputs/DocumentUpload/DocumentUpload.vue#L2 (is this what you need?)

what about sql that I see in some rows in hbcd, there are usually some "labels", e.g. "illness label". In addition that I don't know how to represent in reproschema inputType (I believe we don't have sql), I'm guessing that this should be hidden??

for sql we should treat it as the same we do for calc; it's a calculation with a different language. so we probably need to translate SQL to JS expressions or we extract the SQL and put it in compute for now but fix it later @satra what do you think? and we should use isVis: false for sql too

I can even see in Field Annotation of the csv "@hidden", but in our code "Field Annotation" is translated to "description", so I'm a bit lost...

sorry about this, yes, it should not be translated to description directly. I found most rows in Field Annotation have @hidden so we should use Field Annotation for multiple attributes such as isVis, other attributes usage see below

I the "Field Annotation" I can see multiple mysterious values, that I'm pretty sue are not human-readable descriptions, but have no ide what to do with them, e.g. "@hidden @NOW-SERVER" or "@noneoftheabove=5"

yes, for @LANGUAGE-CURRENT-SURVEY we are supposed to specify something language related. It's row 1876, a radio for choosing language. so it's like depending on which language a participant chose, we need to display reproschema in that language (something like selected_language in the UI)

row 1877, it has @NOW it should be date related, same as @NOW-SERVER

row 1879, it has @IF([event-name][visit_start_complete] != 2, @DEFAULT='1', @DEFAULT='2') we should treat it as condition or branch logic

row 1886 @NONEOFTHEABOVE = '4,999,777', should be treated as condition/branch logic too.

row 2455 @CALCTEXT(if([pex_bm_healthv2_preg_i_illness_019_i_03] != "", [pex_bm_healthv2_preg_i_illness_019_i_03], [pex_bm_healthv2_preg_i_illness_019_i_05])) this item should be treated as compute.

that's a lot of variations, we probably need a dictionary about how to map them. I can take a look at all HBCD and bridge2ai CSV files and find unique values of Field Annotation and create a dictionary for you this weekend

djarecka commented 3 months ago

@yibeichan @satra

I want to make two main points from HBCD:

These issues are completely unrelated to my changes. I've been fixing multiple old issues with the code, since I'm getting newer examples, and I can keep doing this (if someone helps me with it), but perhaps this should be done in another PR? However, this seems like critical issue, so if I add to compute or isVis that I see in cvs files no one should expect the schema to work with ui.

yibeichan commented 3 months ago
  • I believe neither requirements that are marked as sql (e.g. SELECT distinct label from redcap_web_service_cache WHERE value in (SELECT ...., not marked just calc (e.g. $('#pex_bm_healthv2_preg_i_illness_016_i_03-tr select option:nth(1)').prop('selected',true)) doesn't look to me as something that could be just added to reproschema:compute An example from reproschema that I found has forms like this

you mean SQL cannot be directly put into jsExpression right? yes, I agree, that's why I asked "for sql we should treat it as the same we do for calc; it's a calculation with a different language. so we probably need to translate SQL to JS expressions or we extract the SQL and put it in compute for now but fix it later?"

  • for the branching logic, I found examples here, so the the form from "Branching Logic" in HBCD from that looks like [sed_bm_demo_roster_003] >= 1 AND [sed_bm_demo_roster_004_i_01] = '0' can't be directly included in isVis, but at least I know that it could be translated. However, for the items that could be found in Field Annotation I really have no idea what to do (well, @Hidden is easy to guess), I could add @NONEOFTHEABOVE = '4,999,777' to the reproschema schema, but I really do not think reproschema-ui will know what to do.

so among all HBCD csv files I have, Field Annotation has the following unique values: {'@NOW', '@NONEOFTHEABOVE', '@READONLY', '@IF', '@NOW-SERVER', '@LANGUAGE-CURRENT-SURVEY', '@CALCDATE', '@CALCTEXT', '@HIDDEN'}

@NOW, @NOW-SERVER should be something related to date, UI has something called [Timer](https://github.com/ReproNim/reproschema-ui/blob/05096f07b3c27e82ee9382849b073c0896448a47/src/components/Timer/Timer.vue#L51), could be related

@READONLY can be treated as description

@IF can be treated as branch logic

@CALCDATE has examples as @CALCDATE([screening_arm_1][setup_lmp], 0, "d") it should do some computing but yes, it's a complicated issue

@CALCTEXT has examples as @CALCTEXT(if([pex_bm_healthv2_preg_i_illness_015_i_03] != "", [pex_bm_healthv2_preg_i_illness_015_i_03], [pex_bm_healthv2_preg_i_illness_015_i_05])), we can put it into isVis but reproschema may not understand it

@NONEOFTHEABOVE has examples as @NONEOFTHEABOVE='12,999,777' this can will probably need to be specified based on the response option everytime it appears. but I'm not sure what this @NONEOFTHEABOVE is used for later.

@LANGUAGE-CURRENT-SURVEY is related to participant's choice of language. we have language in the UI here but it's for landing page

@HIDDEN everytime we have it, we should set isVis: false

These issues are completely unrelated to my changes. I've been fixing multiple old issues with the code, since I'm getting newer examples, and I can keep doing this (if someone helps me with it), but perhaps this should be done in another PR? However, this seems like critical issue, so if I add to compute or isVis that I see in cvs files no one should expect the schema to work with ui.

I agree that those are complicated but critical issues. we should get this PR merged, release out, and open another PR to fix those issues

yibeichan commented 3 months ago

@djarecka anything else we need to check/update in this PR? If not, I guess we can merge it and move on (to get the release out)

djarecka commented 3 months ago

Don't have good internet today, but working on a couple more commits and later today I will add them to the pr and will create issues with things that need to be done in addition

On Mon, Jun 10, 2024, 9:49 AM Yibei Chen @.***> wrote:

@djarecka https://github.com/djarecka anything else we need to check/update in this PR? If not, I guess we can merge it and move on (to get the release out)

— Reply to this email directly, view it on GitHub https://github.com/ReproNim/reproschema-py/pull/36#issuecomment-2158574300, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMV6GT2V32AFQGC53KJRSLZGW4IZAVCNFSM6AAAAABFZZZXJKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJYGU3TIMZQGA . You are receiving this because you were mentioned.Message ID: @.***>

djarecka commented 3 months ago

@yibeichan - at the end I added many command to do the script redcap2reproschema, please take a look if these make sense to you. I decided to explicitly add instruction for all elements to INPUT_TYPE_MAP, and VALUE_TYPE_MAP (previously the default behavior was to not apply any mapping, but that lead the elements that were completely ignored by reproschame) , so if something is not in the dictionaries, you will get an error, and that means that we have to decide how it should be represented in reproschema.

re. compute - maybe I wasn't clear on Sat, but I believe that even the elements in calc would not be interpreted by reproschema-ui, but for now I left it (and added comment, that should be also added to issues)

If that looks goo, one thing we should change for sure, is the path to contextfile

yibeichan commented 3 months ago

@djarecka thanks for the work!! I tested nimh-minimal with a roundtrip: reproschema2redcap and redcap2reproschema. Everything looks good, except the branching logic. In nimh-minimal, we have a special branching logic case. Whether participants will answer DSM5 youth or DSM5 parents depends on the interview age (an item from the previous activity). We had some problems making the ui work for this case, but @satra found a (temporary) solution for ui. So, it's reasonable that our current (general) reproschema2redcap and redcap2reproschema cannot catch this logic.

I'm going to merge this PR now