pepkit / looper

A job submitter for Portable Encapsulated Projects
http://looper.databio.org
BSD 2-Clause "Simplified" License
20 stars 7 forks source link

looper destroy is broken #468

Closed nsheff closed 3 weeks ago

nsheff commented 4 months ago

Can't looper destroy something.

Related to #459. and also #464 -- these things should have been caught by testing.

looper destroy --sel-incl demo4
Looper version: 1.7.0
Command: destroy
Using looper config (.looper.yaml).
Config file does not have version key. Defaulting to 2.1.0
Config file does not have version key. Defaulting to 2.1.0
Removing results:
Removing summary:
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialized looper config file: /home/nsheff/code/seqcolapi/analysis/demo/looper_pipestat_config.yaml
Initialize PipestatBackend
Initialize FileBackend
Traceback (most recent call last):
  File "/home/nsheff/.local/bin/looper", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/nsheff/.local/lib/python3.11/site-packages/looper/cli_looper.py", line 747, in main
    return Destroyer(prj)(args)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/nsheff/.local/lib/python3.11/site-packages/looper/looper.py", line 262, in __call__
    destroy_summary(self.prj, args.dry_run, args.project)
  File "/home/nsheff/.local/lib/python3.11/site-packages/looper/looper.py", line 714, in destroy_summary
    pipeline_name=psm["_pipeline_name"],
                  ~~~^^^^^^^^^^^^^^^^^^
  File "/home/nsheff/.local/lib/python3.11/site-packages/pipestat/pipestat.py", line 278, in __getitem__
    result = self.retrieve_one(record_identifier=key)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nsheff/.local/lib/python3.11/site-packages/pipestat/pipestat.py", line 101, in inner
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nsheff/.local/lib/python3.11/site-packages/pipestat/pipestat.py", line 713, in retrieve_one
    raise RecordNotFoundError(f"Record '{record_identifier}' not found")
pipestat.exceptions.RecordNotFoundError: Record '_pipeline_name' not found
donaldcampbelljr commented 4 months ago

I see where this went wrong and will add a partial fix shortly. Basically, we changed pipestat's item access behavior last fall. psm[_pipeline_name] should change to psm.pipeline_name

What is Destroyer currently doing? For each sample, the destroy function will check the sample output folder, e.g. `results/results_pipeline/frog_1' and destroy that folder. It will not touch the submission folder.

It will not destroy the results.yaml created by pipestat. Thus, currently, selecting a specific sample will not remove it from the results.yaml file.

Destroyer goes on to call destroy_summary: It appears as though the destroy_summary function only destroys the summary files:

To that point, we should make sure it will also destroy the aggregate_results.yaml produced when running results to multiple results files (e.g. currently how PEPATAC operates).

Also, there do not appear to be any tests for Destroyer functionality. So, it makes sense that this broke without us knowing.

donaldcampbelljr commented 3 months ago

I've implemented the fix for this with the above merged PR.