Closed stevekm closed 7 years ago
Started documenting a more thorough sample filepath retrieval method here: https://github.com/NYU-Molecular-Pathology/snsxt/blob/961be0f27dfdfd3beed497dc9fcc477ebecd7f62/snsxt/sns_tasks/_DemoQsubSampleTask.py#L38
Also need to reconsider consistency between the sample filepath retrieval methods, which return a list, and the file functions/methods that take a single character string as input. The same applies to situations where multiple files might be returned or needed from a single step. Might need to enforce filepath lists more globally throughout the program and submodules.
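One way the list-vs-string mismatch could be smoothed over is a small normalizing helper at the boundary of each function that accepts filepaths. This is only a sketch: `as_path_list` is a hypothetical name, not something that exists in `snsxt`.

```python
def as_path_list(paths):
    """Return filepaths as a list, whether given None, one string, or an iterable.

    Hypothetical helper for illustration; not part of snsxt.
    """
    if paths is None:
        return []
    # A bare string is a single path, not an iterable of characters
    if isinstance(paths, str):
        return [paths]
    return list(paths)
```

Functions that call it on their input could then always loop over the result, so a step that yields one file and a step that yields several are handled the same way.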
Updated the sample filepath retrieval method here: https://github.com/NYU-Molecular-Pathology/snsxt/blob/529cac27822869127f36e4449bacd33e3232dfe6/snsxt/sns_tasks/_DemoQsubSampleTask.py#L37
Currently, files for a given sample for a given step in the analysis pipeline are retrieved through filename pattern matching as per here:
This may not be specific enough to prevent matching the wrong file(s) if two or more samples in an analysis have similar names, such as
Need to revise this to do a more exact search to prevent the possibility of mismatches. Consider creating an 'expected' exact filename and doing a search for an exact match. Alternatively, consider using the `samples.*.csv` files output by `sns` with paths to expected files, or record the output paths of files in `snsxt` analysis steps for later retrieval.

Also for reference, the file retrieval class method and `find` module
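A minimal sketch of the 'expected exact filename' idea, compared against a prefix-style pattern match. Everything named here is an assumption made for illustration: the sample IDs `Sample1` and `Sample10`, the `.dd.bam` suffix, and both function names are hypothetical, and this does not reproduce the actual `snsxt` retrieval method or its `find` module.

```python
import glob
import os
import tempfile


def find_by_pattern(search_dir, sample_id):
    """Prefix pattern match: returns every file whose name starts with sample_id."""
    return sorted(glob.glob(os.path.join(search_dir, sample_id + "*")))


def find_exact(search_dir, sample_id, suffix):
    """Exact match: returns only files named exactly sample_id + suffix."""
    expected = sample_id + suffix
    matches = []
    for root, _dirs, files in os.walk(search_dir):
        for name in files:
            if name == expected:
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Two hypothetical samples whose IDs share a prefix
        for name in ("Sample1.dd.bam", "Sample10.dd.bam"):
            open(os.path.join(tmp, name), "w").close()
        print(find_by_pattern(tmp, "Sample1"))        # pattern match
        print(find_exact(tmp, "Sample1", ".dd.bam"))  # exact match
```

With these two files, the pattern search for `Sample1` picks up the `Sample10` file as well, while the exact search returns only the file for `Sample1`. Returning a list even for a single hit keeps the result shape consistent with steps that produce multiple files.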