Sage-Bionetworks / sysbioDCCjsonschemas

SysBio DCC JSON schemas

consolidate template generation scripts #124

Closed avanlinden closed 2 years ago

avanlinden commented 2 years ago

Both the create_template_from_Syn_schema.py and create_template_from_Syn_schema_sorted.py scripts are a little difficult to use. For example, the "sorted" version still contains code from the original version that expects a config file, but it has no args.config_file input: https://github.com/Sage-Bionetworks/sysbioDCCjsonschemas/blob/2df89feb4eaa98135c5db649ef90057aa09e0711/code/python/create_template_from_Syn_schema_sorted.py#L144

I propose that we combine both scripts into one and refactor the template creation process to have an option like "--sort" which, if set, would take an additional input specifying which JSON file to sort on. The sorting function is fairly critical because we need the individual or specimenID column to be the first column in those templates.
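A rough sketch of what that option could look like, not the existing scripts' code. The function names, the `schema_id` positional argument, and the example column names are hypothetical, and it assumes the reference JSON file either has a top-level "properties" object or is itself a flat object whose key order gives the desired column order:

```python
import argparse
import json


def build_parser():
    """Hypothetical CLI for a consolidated template script."""
    parser = argparse.ArgumentParser(description="Create a metadata template")
    parser.add_argument("schema_id", help="identifier of the registered schema")
    # --sort takes the path of a JSON file whose key order defines column order
    parser.add_argument("--sort", metavar="JSON_FILE", default=None,
                        help="JSON file to sort the template columns on")
    return parser


def load_reference_keys(json_path):
    """Read the key order from the reference JSON file (assumed layout)."""
    with open(json_path) as handle:
        schema = json.load(handle)
    # json.load keeps the key order of the file
    return list(schema.get("properties", schema))


def order_columns(columns, reference_keys=None):
    """Put columns named in reference_keys first, in that order.

    Columns not named in reference_keys keep their original relative order.
    """
    if not reference_keys:
        return list(columns)
    present = set(columns)
    ordered = [key for key in dict.fromkeys(reference_keys) if key in present]
    seen = set(ordered)
    ordered += [col for col in columns if col not in seen]
    return ordered
```

With a reference list that starts with the ID columns, `order_columns(["assay", "specimenID", "individualID"], ["individualID", "specimenID"])` moves both ID columns to the front and leaves `assay` after them.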

Consolidating will also let us add the syn.store() step, which automatically uploads the Excel template file to Synapse, to the version of the script that can also sort the columns, which will be very handy. Then we should update the documentation in the repo README with examples of how to run with storage and sorting if desired.

@danlu1 What do you think? If this sounds reasonable I can start tinkering around on it later this week.

danlu1 commented 2 years ago

May I ask what you mean by "Synapse storage"? Do you mean template registration?

avanlinden commented 2 years ago

@danlu1 The step where the Excel file generated by the script is stored to Synapse with a syn.store() command. It's super helpful to have that happen automatically as part of running the script to make the template!
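For illustration, a minimal sketch of that storage step as an optional part of the script. The `store_template` helper and its parameters are hypothetical, and it assumes credentials are already cached or set in a Synapse config file so that `syn.login()` works without arguments:

```python
def store_template(template_path, parent_id=None, syn=None):
    """Upload the template to Synapse, or do nothing if no parent is given.

    parent_id is the Synapse ID of the folder or project to store into.
    """
    if parent_id is None:
        return None  # storage was not requested

    # Imported lazily so the script still runs without the client installed
    # when no upload is requested.
    import synapseclient

    if syn is None:
        syn = synapseclient.Synapse()
        syn.login()  # assumes cached credentials or a config file

    entity = synapseclient.File(path=template_path, parent=parent_id)
    return syn.store(entity)
```

Calling `store_template("template.xlsx", parent_id="syn123")` would upload the file, where `syn123` is only a placeholder ID. Leaving `parent_id` unset skips the upload, so the same script can be run with or without storage.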

danlu1 commented 2 years ago

Gotcha! I agree that we should combine them into one. I created the sorted one because I found that after running synRestPOST, the order of the keys changed, but not in any pattern I could identify (neither the order in the JSON nor alphabetical order). I know what happened: I renamed the sorted code directly on GitHub by adding "sorted" to the filename, which accidentally removed the old version of create_template_from_Syn_schema.py that only contained the functionality to syn.store() the template. It turns out the code has been messed up a bit. Let me fix this.

avanlinden commented 2 years ago

Ah ok, that makes sense. We should have:

Also this is not a high priority so no rush.

thomasyu888 commented 2 years ago

We are looking to add the schemas API functionality to the Python client, and we have a draft PR open. You can use it by installing this branch: https://github.com/Sage-Bionetworks/synapsePythonClient/pull/894.

It would be helpful for more people to test the feature before we release it.

danlu1 commented 2 years ago

Sure. I can try to use it.