PavlidisLab / Gemma

Genomics data re-analysis
Apache License 2.0

Make CLI logging when bulk-processing more friendly (QOL) #633

Open ppavlidis opened 1 year ago

ppavlidis commented 1 year ago

Suggestions for the experiment or arrayDesign CLI framework, when processing multiple entities:

The current format looks like this:

ExpressionExperiment Id=3911 Name=Gene Experssion Profiling-Based Identification of Molecular Subtypes in Stage IV Melanoma with Different Clinical Outcome (test set) (GSE22153):
        Missing values not tolerated in design matrix

Instead, something like:

GSE22153 3911 <name> <error message>
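
As a rough illustration of the one-line format suggested above, here is a minimal Java sketch. The class `BatchLineFormatter` and its methods are hypothetical names made up for this example and are not part of Gemma; the only assumption is that each field should be collapsed onto a single line before being joined in the order short name, ID, name, message.

```java
// Hypothetical helper, not part of Gemma: formats one bulk-processing result per line.
final class BatchLineFormatter {

    private BatchLineFormatter() {
    }

    /** Collapses runs of whitespace (including tabs and newlines) so a field stays on one line. */
    static String singleLine(String s) {
        return s == null ? "" : s.trim().replaceAll("\\s+", " ");
    }

    /** Builds "<shortName> <id> <name> <message>", the field order suggested in this issue. */
    static String formatLine(String shortName, long id, String name, String message) {
        return String.join(" ",
                singleLine(shortName),
                Long.toString(id),
                singleLine(name),
                singleLine(message));
    }

    public static void main(String[] args) {
        // The experiment name here is a placeholder, not the real dataset title.
        System.out.println(formatLine("GSE22153", 3911L,
                "Example experiment name",
                "\tMissing values not tolerated in design matrix"));
    }
}
```

Because names can contain spaces, a space-separated line like this is easy to read but hard to split reliably by column.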

arteymix commented 1 year ago

I'd advocate instead for a tabular output for bulk-processed datasets, enabled via an option (perhaps a companion flag that switches the standard output to a tabular format). We would also have to ensure that the logs are sent to stderr so that the output works with cut and similar tools.
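
To make the stdout/stderr split concrete, here is a minimal Java sketch, assuming tab-separated rows on standard output and log messages on standard error. `TabularBatchOutput`, `writeRow`, `clean`, and the column names are all hypothetical and do not reflect any existing Gemma option or class.

```java
import java.io.PrintStream;

// Hypothetical sketch, not Gemma's actual CLI code.
final class TabularBatchOutput {

    private TabularBatchOutput() {
    }

    /** Replaces tabs and line breaks so a value cannot break the column layout. */
    static String clean(String field) {
        return field == null ? "" : field.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    /** Writes one tab-separated row, terminated by a newline, to the given stream. */
    static void writeRow(PrintStream out, String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append('\t');
            }
            row.append(clean(fields[i]));
        }
        out.print(row.append('\n'));
    }

    public static void main(String[] args) {
        // Log messages go to stderr so they never mix with the table.
        System.err.println("Processing 1 experiment...");
        // The table itself goes to stdout (column names are assumptions).
        writeRow(System.out, "short_name", "id", "status", "message");
        writeRow(System.out, "GSE22153", "3911", "ERROR",
                "Missing values not tolerated in design matrix");
    }
}
```

With this split, something like `java TabularBatchOutput.java 2>/dev/null | cut -f1,3` would keep only the short name and status columns, since `cut` uses tab as its default delimiter.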

I have a branch that separates the bulk processing features from AbstractCLI into an AbstractBatchProcessingCLI. That would allow us to add more tailored features without polluting all the tools.
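
For readers unfamiliar with the idea, a separate batch-processing base class could look roughly like the sketch below. This is not the code from the branch mentioned above: `BatchProcessorSketch`, `Result`, `processOne`, and `processAll` are hypothetical names, and the only assumption is that a failure in one entity is recorded rather than aborting the whole batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real AbstractBatchProcessingCLI may look quite different.
abstract class BatchProcessorSketch<T> {

    /** Outcome for a single entity; error is null on success. */
    static final class Result {
        final String entity;
        final String error;

        Result(String entity, String error) {
            this.entity = entity;
            this.error = error;
        }

        boolean isError() {
            return error != null;
        }
    }

    /** Subclasses implement the per-entity work and throw on failure. */
    protected abstract void processOne(T entity) throws Exception;

    /** Runs every entity, recording failures instead of aborting the batch. */
    final List<Result> processAll(Iterable<T> entities) {
        List<Result> results = new ArrayList<>();
        for (T entity : entities) {
            try {
                processOne(entity);
                results.add(new Result(String.valueOf(entity), null));
            } catch (Exception ex) {
                results.add(new Result(String.valueOf(entity), ex.getMessage()));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        BatchProcessorSketch<String> demo = new BatchProcessorSketch<String>() {
            @Override
            protected void processOne(String shortName) {
                // Simulated failure for illustration only.
                if (shortName.equals("GSE22153")) {
                    throw new IllegalStateException("Missing values not tolerated in design matrix");
                }
            }
        };
        // "GSE00000" is a made-up identifier used only for this demo.
        for (Result r : demo.processAll(List.of("GSE00000", "GSE22153"))) {
            System.out.println(r.entity + "\t" + (r.isError() ? "ERROR\t" + r.error : "OK"));
        }
    }
}
```

Keeping the loop and result collection in one base class means individual tools only implement the per-entity step, and reporting features can be added in a single place.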

ppavlidis commented 1 year ago

Sounds good