eastgenomics / eggd_ici_uploader

MIT License
0 stars 0 forks source link

V1.0.0 Release PR #4

Closed spaul91 closed 9 hours ago

spaul91 commented 2 days ago

This PR contains the commit intended to be packaged as the release of V1.0.0 of eggd_ici_uploader


This change is Reviewable

spaul91 commented 1 day ago

README.md line 32 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Can you add something here how we find the workflow id and what this does?

See "# What data are required for this app to run?" for the definition

spaul91 commented 1 day ago

README.md line 35 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
could you give an example of the run cmd with a fake project id.

Why specifically the project ID?

spaul91 commented 1 day ago

ici-uploader-image/dxasset.json line 7 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
is this instance type correct, it differs from the dxapp.json, is that because this is an asset?

This is the instance used to build the asset, not run it. It's invoked once

spaul91 commented 1 day ago

Dockerfile line 17 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
what are you trying to remove here? Could it be more explicit?

Intermediate install files. I've added a comment

spaul91 commented 1 day ago

src/script.sh line 47 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
255051004 this is the id for tumour of unknown origin write? Could we have this configurable as an input in-case we want to change the snomed id?

Would we ever want to change the SNOMED ID? This is essentially a hack to kick-start processing. If we make this configurable, we'll want to add a way to pass in new SNOMED IDs, and I can't think of a scenario where we'd want to apply the same ID to every sample unless it's just to get processing to begin. I doubt we'll want to swap "Tumour of Unknown Origin" to "Unknown Tumour", for example.

On the other hand, if we get to a point where we want to apply SNOMED IDs that are specific to each sample's tumour types, we'll need to completely rewrite this function (and likely other parts of the program) since the script's goal here will change.

spaul91 commented 1 day ago

src/script.sh line 70 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
I think it would be good to add some handling for multiple samplesheets here, as we don't want to be using a samplesheet if multiple exist in the top most directory.

I think this is something we should discuss as a group, since this has implications for the workflow config too

spaul91 commented 1 day ago

src/script.sh line 89 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
This is a nice feature!

Thanks :) it's a nice feature I forgot to give the user an opportunity to invoke because there's no user argument for it haha. If I add it I'll need to repeat all the testing... will decide if it's something I want to do

spaul91 commented 1 day ago

Dockerfile line 17 at r1 (raw file):

Previously, spaul91 (Sophie Paul) wrote…
Intermediate install files. I've added a comment

Done

spaul91 commented 1 day ago

dxapp.json line 12 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
can you remove your user name from this as our standards are to just have "org-emee_1"

Done.

spaul91 commented 1 day ago

dxapp.json line 57 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Is this a reasonable timeout period? We expect it to take 2 hours?

Reduced to 3 hours, to give us an extra hour of leeway just in case

spaul91 commented 1 day ago

dxapp.json line 57 at r1 (raw file):

Previously, spaul91 (Sophie Paul) wrote…
Reduced to 3 hours, to give us an extra hour of leeway just in case

Done

spaul91 commented 1 day ago

src/script.sh line 6 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
could you add some comments explaining the positional variables. i.e. $1 == Run path to the run folder in DNAnexus i.e. example_path/

I'm not so sure that this is non-obvious, and the google style guide recommends you don't add comments to functions that are short and obvious. That said, I didn't read the style guide before writing this script, so I should probably review the rest of the code and make sure it's compliant.

Is it the google style guide the one our group follows for bash scripts?

spaul91 commented 1 day ago

src/script.sh line 11 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Would be good to quote any path variables i.e. "${SAMPLESHEET_PATH}" or "${RUN_PATH}" see google style guide: https://google.github.io/styleguide/shellguide.html#quoting

Agreed, will go back and add them

spaul91 commented 1 day ago

Readme.md line 1 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
should this readme exist? Looks like it was accidentally committed?

removed

spaul91 commented 1 day ago

Readme.developer.md line 3 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
same for this, is it meant to be here? if so could you add some documentation?

removed

spaul91 commented 1 day ago

dxapp.json line 57 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Sorry to ask again but I think we should be a bit more liberal with the time out so maybe 4 hours?

Done.

spaul91 commented 1 day ago

README.md line 35 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
I just meant an example cmd with fake data. i.e ```bash dx run app-eggd_ici_uploader/1.0.0 \ -irun_folder="project-1234:tso500/output/" \ -iworkflow_id="1234-1234-1234" \ -iapi_key="" ``` Specifically showing the path we use for TSO500 currently so if anyone picked this up they would have a better idea how this works.

I'm sorry, but I respectfully disagree. It's a feature of DNANexus, and anyone using this app should be familiar with the way paths are specified. I think it's fine as it is, and I don't see what's ambiguous about the placeholders I'm using already

spaul91 commented 1 day ago

README.md line 32 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Maybe just adding some info to the # What data are required for this app to run?" on what this configures?

Isn't it enough as it is? The section tells you to get the workflow ID from ICI, and I don't think it should be the responsibility of these docs to outline all of the steps needed to define and copy a workflow ID from ICI.

I'm afraid I don't quite understand what you need from me on this, so if you know what to replace this with, please feel free to make the edits and open a PR against the dev branch.

spaul91 commented 1 day ago

src/script.sh line 6 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Yeah that's fair enough, I think it's quite clear for this one. I was quoting the style guide just for the best practice of quoting really, not sure if we have finalised a style guide for shell scripts ShellCheck is mentioned in the draft development guide so maybe follow that.

I've updated the script to follow SpellCheck conventions. I've left out the function comments because I don't think they're complex enough to warrant the verbiage

spaul91 commented 1 day ago

src/script.sh line 11 at r1 (raw file):

Previously, spaul91 (Sophie Paul) wrote…
Agreed, will go back and add them

Done

spaul91 commented 1 day ago

src/script.sh line 47 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
my thought would be if we then apply this to MYE as well as TSO500 there might be a use-case for running this with a different snomed CT ID. I agree would require a re-write if we want to add individual snomed ct ids.

I don't think so, SNOMED IDs should be specific to samples, but the way we're using them right now is just as a placeholder until something specific is decided on. This won't change across pipelines for the time being.

spaul91 commented 13 hours ago

src/script.sh line 45 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Might be useful to be explicit here i.e. "255051004" is the SNOMED CT ID..

Done.

spaul91 commented 11 hours ago

src/script.sh line 20 at r3 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
I think this should be "${DX_PROJECT_CONTEXT_ID}:${RUN_PATH}"

there must have been a reason I did this in this horrid, wacky way. Maybe ShellCheck threw a warning at me. I'll check

spaul91 commented 11 hours ago

README.md line 32 at r1 (raw file):

Previously, PolarisAstrum-NHS (Polaris Astrum) wrote…
I agree Sophie, getting the workflow ID seems like something that belongs in a confluence page rather than the readme file.

@RSWilson1 I think that's kind of confusing, it doesn't tell you where to look for the workflow ID (which is the important part, not all of the things it encapsulates). I'd suggest leaving it as it is imo, but again if you can think of a more concise and clear way to word it, I'd be more than happy to replace what's there now

spaul91 commented 11 hours ago

README.md line 35 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Yes and we should probably have this as a separate example for TSO500 as we might use this for MYE in the future.

Instructions will be in the user guide for running that specific workflow, I don't want to go down the route of adding examples for every pipeline in the app instructions themselves, because they should be general, not specific. As a compromise, would you accept me adding a single path for TSO500 and making it clear that it's an example?

spaul91 commented 11 hours ago

src/script.sh line 89 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
Did you decide on this? I think as we have made quite a few changes we should repeat testing with the final version anyway.

I'm being dumb. I did add it as a feature in dxapp.json, I just didn't put that in the README. I could go either way on testing the dry-run feature, it's only really for local testing. I'll be repeating testing either way since as you said we've changed a lot about this app at this point.

spaul91 commented 11 hours ago

src/script.sh line 20 at r3 (raw file):

Previously, spaul91 (Sophie Paul) wrote…
there must have been a reason I did this in this horrid, wacky way. Maybe ShellCheck threw a warning at me. I'll check

Nope, just me being silly. Thanks for catching this one

spaul91 commented 9 hours ago

src/script.sh line 70 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
yes let's discuss

Done - Discussed offline, non-issue

spaul91 commented 9 hours ago

README.md line 35 at r1 (raw file):

Previously, RSWilson1 (Robert Wilson) wrote…
I have found it helpful in the past to have some examples which just help make it clear how to use this of course we don't need an example for every assay or use-case. Yes let's add a single path for TSO500 as an example that way it's recorded somewhere :thumbsup:

Done.