Closed by DriesSchaumont 2 months ago
On ref (main): 7d48ed707a295e659bcf0d5f13f4c55ebc967d8d Pipeline process_samples
process_samples
executor > local (5) [14/c13bab] process > test_wf5:move_layer:process... [100%] 1 of 1 ✔ [b8/f333a4] process > test_wf5:process_samples:ru... [100%] 1 of 1 ✔ [4f/a02532] process > test_wf5:process_samples:ru... [100%] 1 of 1 ✔ [- ] process > test_wf5:process_samples:ru... - [57/2cec88] process > test_wf5:process_samples:ru... [100%] 1 of 1, failed: 1 ✘ [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... - [- ] process > test_wf5:process_samples:ru... 
-
After splitting modalities: [5k_human_antiCMV_T_TBNK_select_layer, [output:/home/runner/work/openpipeline/openpipeline/work/4f/a02532cc5ae05cb51deaa5f08fed4e/5k_human_antiCMV_T_TBNK_select_layer.split_modalities_component.output, highly_variable_features_var_output:filter_with_hvg, highly_variable_features_obs_batch_key:sample_id, mitochondrial_gene_regex:^[mM][tT]-, top_n_vars:[50, 100, 200, 500], pca_overwrite:false, id:5k_human_antiCMV_T_TBNK_select_layer, input:/home/runner/work/openpipeline/openpipeline/work/4f/a02532cc5ae05cb51deaa5f08fed4e/5k_human_antiCMV_T_TBNK_select_layer.split_modalities_component.output/5k_human_antiCMV_T_TBNK_select_layer.add_id.output_rna.h5mu, rna_layer:test_layer, rna_min_counts:2, rna_max_counts:1000000, rna_min_genes_per_cell:1, rna_max_genes_per_cell:1000000, rna_min_cells_per_gene:1, rna_min_fraction_mito:0.0, rna_max_fraction_mito:1.0, var_name_mitochondrial_genes:mitochondrial, obs_name_mitochondrial_fraction:fraction_mitochondrial, workflow_output:$id.$key.output.h5mu, var_qc_metrics:filter_with_hvg,mitochondrial, modality:rna]]
After splitting modalities: [5k_human_antiCMV_T_TBNK_select_layer, [output:/home/runner/work/openpipeline/openpipeline/work/4f/a02532cc5ae05cb51deaa5f08fed4e/5k_human_antiCMV_T_TBNK_select_layer.split_modalities_component.output, highly_variable_features_var_output:filter_with_hvg, highly_variable_features_obs_batch_key:sample_id, mitochondrial_gene_regex:^[mM][tT]-, top_n_vars:[50, 100, 200, 500], pca_overwrite:false, id:5k_human_antiCMV_T_TBNK_select_layer, input:/home/runner/work/openpipeline/openpipeline/work/4f/a02532cc5ae05cb51deaa5f08fed4e/5k_human_antiCMV_T_TBNK_select_layer.split_modalities_component.output/5k_human_antiCMV_T_TBNK_select_layer.add_id.output_gdo.h5mu, rna_layer:test_layer, rna_min_counts:2, rna_max_counts:1000000, rna_min_genes_per_cell:1, rna_max_genes_per_cell:1000000, rna_min_cells_per_gene:1, rna_min_fraction_mito:0.0, rna_max_fraction_mito:1.0, var_name_mitochondrial_genes:mitochondrial, obs_name_mitochondrial_fraction:fraction_mitochondrial, workflow_output:$id.$key.output.h5mu, var_qc_metrics:filter_with_hvg,mitochondrial, modality:gdo]]
WARN: Key for module 'grep_annotation_column' is duplicated.
WARN: Key for module 'calculate_qc_metrics' is duplicated.
WARN: Key for module 'publish' is duplicated.
WARN: Key for module 'pca' is duplicated.
WARN: Key for module 'find_neighbors' is duplicated.
WARN: Key for module 'umap' is duplicated.

Error executing process > 'test_wf5:process_samples:run_wf:runEachWf:rna_singlesample:run_wf:qc:run_wf:grep_annotation_column:processWf:grep_annotation_column_process (5k_human_antiCMV_T_TBNK_select_layer)'

Caused by:
  Process `test_wf5:process_samples:run_wf:runEachWf:rna_singlesample:run_wf:qc:run_wf:grep_annotation_column:processWf:grep_annotation_column_process (5k_human_antiCMV_T_TBNK_select_layer)` terminated with an error exit status (1)

Command executed:

  # meta exports
  # export VIASH_META_RESOURCES_DIR="/home/runner/work/openpipeline/openpipeline/target/nextflow/metadata/grep_annotation_column"
  export VIASH_META_RESOURCES_DIR=".viash_meta_resources"
  export VIASH_META_TEMP_DIR="/tmp"
  export VIASH_META_FUNCTIONALITY_NAME="grep_annotation_column"
  # export VIASH_META_EXECUTABLE="$VIASH_META_RESOURCES_DIR/$VIASH_META_FUNCTIONALITY_NAME"
  export VIASH_META_CONFIG="$VIASH_META_RESOURCES_DIR/.config.vsh.yaml"
  export VIASH_META_CPUS=1
  export VIASH_META_MEMORY_B=5368709120
  if [ ! -z ${VIASH_META_MEMORY_B+x} ]; then
    export VIASH_META_MEMORY_KB=$(( ($VIASH_META_MEMORY_B+1023) / 1024 ))
    export VIASH_META_MEMORY_MB=$(( ($VIASH_META_MEMORY_KB+1023) / 1024 ))
    export VIASH_META_MEMORY_GB=$(( ($VIASH_META_MEMORY_MB+1023) / 1024 ))
    export VIASH_META_MEMORY_TB=$(( ($VIASH_META_MEMORY_GB+1023) / 1024 ))
    export VIASH_META_MEMORY_PB=$(( ($VIASH_META_MEMORY_TB+1023) / 1024 ))
  fi

  # meta synonyms
  export VIASH_TEMP="$VIASH_META_TEMP_DIR"
  export TEMP_DIR="$VIASH_META_TEMP_DIR"

  # create output dirs if need be
  function mkdir_parent {
    for file in "$@"; do
      mkdir -p "$(dirname "$file")"
    done
  }
  mkdir_parent "5k_human_antiCMV_T_TBNK_select_layer.grep_annotation_column.output.h5mu"

  # argument exports
  export VIASH_PAR_INPUT="_viash_par/input_1/5k_human_antiCMV_T_TBNK_select_layer.add_id.output_rna.h5mu"
  export VIASH_PAR_INPUT_LAYER="test_layer"
  export VIASH_PAR_MODALITY="rna"
  export VIASH_PAR_MATRIX="var"
  export VIASH_PAR_OUTPUT="5k_human_antiCMV_T_TBNK_select_layer.grep_annotation_column.output.h5mu"
  export VIASH_PAR_OUTPUT_MATCH_COLUMN="mitochondrial"
  export VIASH_PAR_OUTPUT_FRACTION_COLUMN="fraction_mitochondrial"
  export VIASH_PAR_REGEX_PATTERN="^[mM][tT]-"

  # process script
  set -e
  tempscript=".viash_script.sh"
  cat > "$tempscript" << VIASHMAIN
  import mudata as mu
  from pathlib import Path
  from operator import attrgetter, itemgetter
  from pandas import Series
  import re
  import numpy as np

  ### VIASH START
  # The following code has been auto-generated by Viash.
  par = {
    'input': $( if [ ! -z ${VIASH_PAR_INPUT+x} ]; then echo "r'${VIASH_PAR_INPUT//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'input_column': $( if [ ! -z ${VIASH_PAR_INPUT_COLUMN+x} ]; then echo "r'${VIASH_PAR_INPUT_COLUMN//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'input_layer': $( if [ ! -z ${VIASH_PAR_INPUT_LAYER+x} ]; then echo "r'${VIASH_PAR_INPUT_LAYER//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'modality': $( if [ ! -z ${VIASH_PAR_MODALITY+x} ]; then echo "r'${VIASH_PAR_MODALITY//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'matrix': $( if [ ! -z ${VIASH_PAR_MATRIX+x} ]; then echo "r'${VIASH_PAR_MATRIX//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'output': $( if [ ! -z ${VIASH_PAR_OUTPUT+x} ]; then echo "r'${VIASH_PAR_OUTPUT//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'output_compression': $( if [ ! -z ${VIASH_PAR_OUTPUT_COMPRESSION+x} ]; then echo "r'${VIASH_PAR_OUTPUT_COMPRESSION//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'output_match_column': $( if [ ! -z ${VIASH_PAR_OUTPUT_MATCH_COLUMN+x} ]; then echo "r'${VIASH_PAR_OUTPUT_MATCH_COLUMN//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'output_fraction_column': $( if [ ! -z ${VIASH_PAR_OUTPUT_FRACTION_COLUMN+x} ]; then echo "r'${VIASH_PAR_OUTPUT_FRACTION_COLUMN//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'regex_pattern': $( if [ ! -z ${VIASH_PAR_REGEX_PATTERN+x} ]; then echo "r'${VIASH_PAR_REGEX_PATTERN//\'/\'\"\'\"r\'}'"; else echo None; fi )
  }
  meta = {
    'functionality_name': $( if [ ! -z ${VIASH_META_FUNCTIONALITY_NAME+x} ]; then echo "r'${VIASH_META_FUNCTIONALITY_NAME//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'resources_dir': $( if [ ! -z ${VIASH_META_RESOURCES_DIR+x} ]; then echo "r'${VIASH_META_RESOURCES_DIR//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'executable': $( if [ ! -z ${VIASH_META_EXECUTABLE+x} ]; then echo "r'${VIASH_META_EXECUTABLE//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'config': $( if [ ! -z ${VIASH_META_CONFIG+x} ]; then echo "r'${VIASH_META_CONFIG//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'temp_dir': $( if [ ! -z ${VIASH_META_TEMP_DIR+x} ]; then echo "r'${VIASH_META_TEMP_DIR//\'/\'\"\'\"r\'}'"; else echo None; fi ),
    'cpus': $( if [ ! -z ${VIASH_META_CPUS+x} ]; then echo "int(r'${VIASH_META_CPUS//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_b': $( if [ ! -z ${VIASH_META_MEMORY_B+x} ]; then echo "int(r'${VIASH_META_MEMORY_B//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_kb': $( if [ ! -z ${VIASH_META_MEMORY_KB+x} ]; then echo "int(r'${VIASH_META_MEMORY_KB//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_mb': $( if [ ! -z ${VIASH_META_MEMORY_MB+x} ]; then echo "int(r'${VIASH_META_MEMORY_MB//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_gb': $( if [ ! -z ${VIASH_META_MEMORY_GB+x} ]; then echo "int(r'${VIASH_META_MEMORY_GB//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_tb': $( if [ ! -z ${VIASH_META_MEMORY_TB+x} ]; then echo "int(r'${VIASH_META_MEMORY_TB//\'/\'\"\'\"r\'}')"; else echo None; fi ),
    'memory_pb': $( if [ ! -z ${VIASH_META_MEMORY_PB+x} ]; then echo "int(r'${VIASH_META_MEMORY_PB//\'/\'\"\'\"r\'}')"; else echo None; fi )
  }
  dep = {
  }
  ### VIASH END

  # START TEMPORARY WORKAROUND setup_logger
  # reason: resources aren't available when using Nextflow fusion
  # from setup_logger import setup_logger
  def setup_logger():
      import logging
      from sys import stdout

      logger = logging.getLogger()
      logger.setLevel(logging.INFO)
      console_handler = logging.StreamHandler(stdout)
      logFormatter = logging.Formatter("%(asctime)s %(levelname)-8s %(message)s")
      console_handler.setFormatter(logFormatter)
      logger.addHandler(console_handler)

      return logger
  # END TEMPORARY WORKAROUND setup_logger
  logger = setup_logger()

  def main(par):
      input_file, output_file, mod_name = Path(par["input"]), Path(par["output"]), par['modality']
      logger.info(f"Compiling regular expression '{par['regex_pattern']}'.")
      try:
          compiled_regex = re.compile(par["regex_pattern"])
      except (TypeError, re.error) as e:
          raise ValueError(f"{par['regex_pattern']} is not a valid regular expression pattern.") from e
      else:
          if compiled_regex.groups:
              raise NotImplementedError("Using match groups is not supported by this component.")
      logger.info('Reading input file %s, modality %s.', input_file, mod_name)
      mudata = mu.read_h5mu(input_file)
      modality_data = mudata[mod_name]
      logger.info("Reading input file done.")
      logger.info("Using annotation dataframe '%s'.", par["matrix"])
      annotation_matrix = getattr(modality_data, par['matrix'])
      default_column = {
          "var": attrgetter("var_names"),
          "obs": attrgetter("obs_names")
      }
      if par["input_column"]:
          logger.info("Input column '%s' was specified.", par["input_column"])
          try:
              annotation_column = annotation_matrix[par["input_column"]]
          except KeyError as e:
              raise ValueError(f"Column {par['input_column']} could not be found for modality "
                               f"{par['modality']}. Available columns:"
                               f" {','.join(annotation_matrix.columns.to_list())}") from e
      else:
          logger.info(f"No input column specified, using '.{par['matrix']}_names'")
          annotation_column = default_column[par['matrix']](modality_data).to_series()
      logger.info("Applying regex search.")
      grep_result = annotation_column.str.contains(par["regex_pattern"], regex=True)
      logger.info("Search results: %s", grep_result.value_counts())

      other_axis_attribute = {
          "var": "obs",
          "obs": "var"
      }
      if par['output_fraction_column']:
          logger.info("Enabled writing the fraction of values that matches to the pattern.")
          input_layer = modality_data.X if not par["input_layer"] else modality_data.layers[par["input_layer"]]
          pct_matching = np.ravel(np.sum(input_layer[:, grep_result], axis=1) / np.sum(input_layer, axis=1))
          assert ((pct_matching >= 0) & (pct_matching <= 1)).all(), \\
              "Fractions are not within bounds, please report this as a bug"
          logger.info("Fraction statistics: \\n%s", Series(pct_matching).describe())
          output_matrix = other_axis_attribute[par['matrix']]
          logger.info("Writing fractions to matrix '%s', column '%s'", output_matrix, par['output_fraction_column'])
          getattr(modality_data, output_matrix)[par['output_fraction_column']] = pct_matching
      logger.info("Adding values that matched the pattern to '%s', column '%s'", par["matrix"], par["output_match_column"])
      getattr(modality_data, par['matrix'])[par["output_match_column"]] = grep_result
      logger.info("Writing out data to '%s' with compression '%s'.", output_file, par["output_compression"])
      mudata.write(output_file, compression=par["output_compression"])

  if __name__ == "__main__":
      main(par)
  VIASHMAIN
  python -B "$tempscript"

Command exit status:
  1

Command output:
  2024-02-28 03:09:34,533 INFO Compiling regular expression '^[mM][tT]-'.
  2024-02-28 03:09:34,534 INFO Reading input file _viash_par/input_1/5k_human_antiCMV_T_TBNK_select_layer.add_id.output_rna.h5mu, modality rna.
  2024-02-28 03:09:34,742 INFO Reading input file done.
  2024-02-28 03:09:34,742 INFO Using annotation dataframe 'var'.
  2024-02-28 03:09:34,742 INFO No input column specified, using '.var_names'
  2024-02-28 03:09:34,742 INFO Applying regex search.
  2024-02-28 03:09:34,744 INFO Search results: gene_ids
  False 5594
  Name: count, dtype: int64
  2024-02-28 03:09:34,744 INFO Enabled writing the fraction of values that matches to the pattern.

Command error:
  Unable to find image 'ghcr.io/openpipelines-bio/metadata_grep_annotation_column:integration_build' locally
  integration_build: Pulling from openpipelines-bio/metadata_grep_annotation_column
  e1caac4eb9d2: Already exists
  51d1f07906b7: Already exists
  fe87ad6b112e: Already exists
  4d8ccb72bbad: Already exists
  8100581c78dd: Already exists
  695bd04ff41a: Pulling fs layer
  747b895a4e23: Pulling fs layer
  695bd04ff41a: Download complete
  695bd04ff41a: Pull complete
  747b895a4e23: Verifying Checksum
  747b895a4e23: Download complete
  747b895a4e23: Pull complete
  Digest: sha256:1c2f660a214ea0c439381ca5ab559be94b037bb2d3a2a80545a0b0901825ea75
  Status: Downloaded newer image for ghcr.io/openpipelines-bio/metadata_grep_annotation_column:integration_build
  .viash_script.sh:104: RuntimeWarning: invalid value encountered in divide
    pct_matching = np.ravel(np.sum(input_layer[:, grep_result], axis=1) / np.sum(input_layer, axis=1))
  Traceback (most recent call last):
    File ".viash_script.sh", line 120, in <module>
      main(par)
    File ".viash_script.sh", line 105, in main
      assert ((pct_matching >= 0) & (pct_matching <= 1)).all(), \
  AssertionError: Fractions are not within bounds, please report this as a bug

Work dir:
  /home/runner/work/openpipeline/openpipeline/work/57/2cec8881b6197de4fd9a464f36209a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
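
The log above shows two things: the regex matched none of the 5594 features, and NumPy emitted `invalid value encountered in divide` just before the bounds assertion failed. NumPy raises that particular warning for operations such as 0/0, which yield NaN, and NaN fails both `>= 0` and `<= 1`. One plausible reading, which the log itself does not confirm, is that at least one cell has a total count of zero in `test_layer`. The sketch below reproduces that situation on made-up data. The feature names, the count matrix and the helper names `naive_fraction` and `guarded_fraction` are hypothetical, and the guarded variant is only one possible way to handle empty cells, not the fix applied in the repository.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical feature names (Ensembl-style IDs), so nothing starts with "MT-".
var_names = pd.Series(["ENSG00000000001", "ENSG00000000002", "ENSG00000000003"])
is_match = var_names.str.contains(r"^[mM][tT]-", regex=True).to_numpy()

# Hypothetical dense count layer (cells x features); the second cell has no counts.
layer = np.array([[3.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]])


def naive_fraction(counts, mask):
    # Same arithmetic as the failing line: matched counts / total counts per cell.
    return np.ravel(np.sum(counts[:, mask], axis=1) / np.sum(counts, axis=1))


def guarded_fraction(counts, mask):
    matched = np.sum(counts[:, mask], axis=1)
    total = np.sum(counts, axis=1)
    # Divide only where the total is non-zero; cells without counts keep 0.0.
    return np.divide(matched, total, out=np.zeros_like(total, dtype=float), where=total > 0)


with warnings.catch_warnings():
    # Silence the 0/0 RuntimeWarning that also appears in the log.
    warnings.simplefilter("ignore", RuntimeWarning)
    naive = naive_fraction(layer, is_match)

naive_in_bounds = bool(((naive >= 0) & (naive <= 1)).all())
guarded = guarded_fraction(layer, is_match)
guarded_in_bounds = bool(((guarded >= 0) & (guarded <= 1)).all())

print("naive:", naive, "in bounds:", naive_in_bounds)
print("guarded:", guarded, "in bounds:", guarded_in_bounds)
```

Whether an empty cell should receive a fraction of 0, stay NaN, or be filtered out before this step is a design decision for the component. The sketch picks 0 only so that the bounds check passes.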