get_stat should use pipestat's retrieve_one

Currently, per the tutorial, pepatac creates a separate `stats.yaml` for each input sample:

results_pipeline
  |__Tutorial1
       |___stats.yaml
  |__Tutorial2
       |___stats.yaml

This is problematic when using pipestat in the looper config file, which is required for `looper report` and `looper link`, because the looper config currently accepts only one pipestat results file.

Spawning separate stats files is the default pypiper behavior, and it can be overridden with the `pipestat_results_file` parameter.

This allows specifying a single results file for the pipeline output:

PEPATAC:
  project: {}
  sample:
    tutorial1:
      File_mb: 27
      pipestat_created_time: '2023-11-20 16:56:32'
      pipestat_modified_time: '2023-11-20 16:56:44'
      Read_type: paired
      Genome: hg38
      Raw_reads: '1000000'
      Fastq_reads: 1000000
      Trimmed_reads: 1000000
      FastQC report r1:
        path: /home/drc/pepatac_tutorial/tools/pepatac/examples/tutorial/home/drc/pepatac_tutorial/processed/results_pipeline/tutorial1/fastq/tutorial1_R1_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r1
        annotation: PEPATAC
      FastQC report r2:
        path: /home/drc/pepatac_tutorial/tools/pepatac/examples/tutorial/home/drc/pepatac_tutorial/processed/results_pipeline/tutorial1/fastq/tutorial1_R2_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r2
        annotation: PEPATAC
      Aligned_reads_rCRSd: 99360.0
      Alignment_rate_rCRSd: 9.94
    tutorial2:
      File_mb: 27
      pipestat_created_time: '2023-11-20 16:58:02'
      pipestat_modified_time: '2023-11-20 16:58:12'
      Read_type: paired
      Genome: hg38
      Raw_reads: '1000000'
      Fastq_reads: 1000000
      Trimmed_reads: 1000000
      FastQC report r1:
        path: /home/drc/pepatac_tutorial/tools/pepatac/examples/tutorial/home/drc/pepatac_tutorial/processed/results_pipeline/tutorial2/fastq/tutorial2_R1_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r1
        annotation: PEPATAC
      FastQC report r2:
        path: /home/drc/pepatac_tutorial/tools/pepatac/examples/tutorial/home/drc/pepatac_tutorial/processed/results_pipeline/tutorial2/fastq/tutorial2_R2_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r2
        annotation: PEPATAC
      Aligned_reads_rCRSd: 100556.0
      Alignment_rate_rCRSd: 10.06

This works well until the pipeline attempts to retrieve a stat via `pm.get_stat`. When the results file contains more than one sample, `get_stat` reports the stat as missing and returns `None`, and the calling code then errors:

Missing stat 'Raw_reads'
Traceback (most recent call last):
  File "/home/drc/pepatac_tutorial//tools/pepatac/pipelines/pepatac.py", line 2784, in <module>
    sys.exit(main())
  File "/home/drc/pepatac_tutorial//tools/pepatac/pipelines/pepatac.py", line 1117, in main
    pm.run([cmd, cmd2], rmdup_bam, follow=check_alignment_genome)
  File "/home/drc/GITHUB/pepatac/pepatac/venv/lib/python3.10/site-packages/pypiper/manager.py", line 1093, in run
    call_follow()
  File "/home/drc/GITHUB/pepatac/pepatac/venv/lib/python3.10/site-packages/pypiper/manager.py", line 947, in call_follow
    follow()
  File "/home/drc/pepatac_tutorial//tools/pepatac/pipelines/pepatac.py", line 1106, in check_alignment_genome
    rr = float(pm.get_stat("Raw_reads"))
TypeError: float() argument must be a string or a real number, not 'NoneType'

I believe the solution is to have pypiper use pipestat's `retrieve_one` instead. Perhaps `get_stat` can be a wrapper around it.
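A rough sketch of what such a wrapper could look like. Everything here is illustrative: `FakePipestat` and `Manager` are hypothetical stand-ins, not pipestat or pypiper code, and I am assuming `retrieve_one` takes a `record_identifier` and a `result_identifier`. The stand-in raises `KeyError` for a missing result; the real exception type may differ.

```python
# Multi-sample results, shaped like the single results file above (values abridged).
RESULTS = {
    "PEPATAC": {
        "project": {},
        "sample": {
            "tutorial1": {"Raw_reads": "1000000", "Aligned_reads_rCRSd": 99360.0},
            "tutorial2": {"Raw_reads": "1000000", "Aligned_reads_rCRSd": 100556.0},
        },
    }
}


class FakePipestat:
    """Hypothetical stand-in for a pipestat manager; only mimics a per-record lookup."""

    def __init__(self, data, pipeline_name):
        self._samples = data[pipeline_name]["sample"]

    def retrieve_one(self, record_identifier, result_identifier):
        # Look up one result for one record (sample).
        return self._samples[record_identifier][result_identifier]


class Manager:
    """Hypothetical, stripped-down stand-in for the pypiper pipeline manager."""

    def __init__(self, psm, record_identifier):
        self._psm = psm
        self._record_identifier = record_identifier

    def get_stat(self, key):
        # Delegate to retrieve_one, scoped to this run's own sample, so other
        # samples stored in the same results file do not interfere.
        try:
            return self._psm.retrieve_one(
                record_identifier=self._record_identifier,
                result_identifier=key,
            )
        except KeyError:  # assumption: the real exception type may differ
            print(f"Missing stat '{key}'")
            return None


pm = Manager(FakePipestat(RESULTS, "PEPATAC"), record_identifier="tutorial2")
rr = float(pm.get_stat("Raw_reads"))
```

The point is only that the lookup is keyed by the sample's record identifier, so a results file holding several samples is no longer a problem for `get_stat`.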

Originally posted by @donaldcampbelljr in https://github.com/databio/pepatac/issues/257#issuecomment-1819897535

databio/pypiper #202