CCBR / RENEE

A comprehensive quality-control and quantification RNA-seq pipeline
https://CCBR.github.io/RENEE/
MIT License
4 stars 4 forks source link

Refactor the renee python package and move the GUI here #94

Closed kelly-sovacool closed 3 months ago

kelly-sovacool commented 10 months ago

Changes

Moves the renee GUI from the ccbrpipeliner repo to this one (see src/renee/gui.py). The launch_gui() function (previously the main function of the GUI script in ccbrpipeliner) calls run() directly instead of using subprocess. Now the gui is a subcommand of renee and is launched by running renee gui. Various static global variables and environmental variables are now setup dynamically with helper functions.

Also:

testing

To test the GUI, switch to this branch and run ./bin/renee gui.

Issues

PR Checklist

(~Strikethrough~ any points that are not applicable.)

kelly-sovacool commented 3 months ago

need to wait for package structure changes from https://github.com/CCBR/RENEE/pull/139 before we can continue with this

Update: it's merged now

kopardev commented 3 months ago

@samarth8392 .. Looks good to me ... can you please re-look and approve?

samarth8392 commented 3 months ago

I tried testing the gui using ./bin/renee gui and doing a dryrun. I get the following error.

Take a look here: https://github.com/CCBR/RENEE/blob/82478010e7931a33c14e7e352ce9292bce70fe57/src/renee/gui.py#L178

/data/CCBR_Pipeliner/db/PipeDB/Conda/envs/py311/lib/python3.11/tkinter/__init__.py:1531: ResourceWarning: unclosed <socket.socket fd=5, family=2, type=1, proto=0, laddr=('0.0.0.0', 0)>
  for k, v in cnf.items():
ResourceWarning: Enable tracemalloc to get the object allocation traceback
window created!
--SUBMIT-- {'--INDIR--': '/data/mathurs2/input', 'Browse': '/data/mathurs2/input', '--OUTDIR--': '/data/mathurs2/output', 'Browse0': '', '--ANNOTATION--': 'hg38_45'}
['/data/mathurs2/input/sample1-tumor.R2.fastq.gz', '/data/mathurs2/input/sample1-normal.R2.fastq.gz', '/data/mathurs2/input/sample1-normal.R1.fastq.gz', '/data/mathurs2/input/sample1-tumor.R1.fastq.gz']
Traceback (most recent call last):
  File "/vf/users/CCBR/projects/techDev/RENEE/main.py", line 16, in <module>
    sys.exit(main())
             ^^^^^^
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/__main__.py", line 1479, in main
    args.func(args)
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/gui.py", line 178, in launch_gui
    tmp_dir=get_tmp_dir(),
            ^^^^^^^^^^^^^
TypeError: get_tmp_dir() missing 2 required positional arguments: 'tmp_dir' and 'outdir'
kelly-sovacool commented 3 months ago

I tried testing the gui using ./bin/renee gui and doing a dryrun. I get the following error.

Take a look here:

https://github.com/CCBR/RENEE/blob/82478010e7931a33c14e7e352ce9292bce70fe57/src/renee/gui.py#L178

/data/CCBR_Pipeliner/db/PipeDB/Conda/envs/py311/lib/python3.11/tkinter/__init__.py:1531: ResourceWarning: unclosed <socket.socket fd=5, family=2, type=1, proto=0, laddr=('0.0.0.0', 0)>
  for k, v in cnf.items():
ResourceWarning: Enable tracemalloc to get the object allocation traceback
window created!
--SUBMIT-- {'--INDIR--': '/data/mathurs2/input', 'Browse': '/data/mathurs2/input', '--OUTDIR--': '/data/mathurs2/output', 'Browse0': '', '--ANNOTATION--': 'hg38_45'}
['/data/mathurs2/input/sample1-tumor.R2.fastq.gz', '/data/mathurs2/input/sample1-normal.R2.fastq.gz', '/data/mathurs2/input/sample1-normal.R1.fastq.gz', '/data/mathurs2/input/sample1-tumor.R1.fastq.gz']
Traceback (most recent call last):
  File "/vf/users/CCBR/projects/techDev/RENEE/main.py", line 16, in <module>
    sys.exit(main())
             ^^^^^^
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/__main__.py", line 1479, in main
    args.func(args)
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/gui.py", line 178, in launch_gui
    tmp_dir=get_tmp_dir(),
            ^^^^^^^^^^^^^
TypeError: get_tmp_dir() missing 2 required positional arguments: 'tmp_dir' and 'outdir'

Thanks for reporting this! I believe I've fixed it now

samarth8392 commented 3 months ago

Thanks for updating the two util functions. Now the error is coming from https://github.com/CCBR/RENEE/blob/9e1266012149f8983d89263753e46f23def32372/src/renee/run.py#L59 and https://github.com/CCBR/RENEE/blob/9e1266012149f8983d89263753e46f23def32372/src/renee/gui.py#L185

I think --wait must be added to the run_args ??

Traceback (most recent call last):
  File "/vf/users/CCBR/projects/techDev/RENEE/main.py", line 16, in <module>
    sys.exit(main())
             ^^^^^^
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/__main__.py", line 1477, in main
    args.func(args)
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/gui.py", line 185, in launch_gui
    allout = run_in_context(run_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/gui.py", line 243, in run_in_context
    run(args)
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/run.py", line 59, in run
    if sub_args.wait:
       ^^^^^^^^^^^^^
AttributeError: 'Namespace' object has no attribute 'wait'
kelly-sovacool commented 3 months ago

ok, I'm working through a couple more errors after this one, will let you know when it's working on my end

kelly-sovacool commented 3 months ago

Yes I'm working on going through and adding all the sub_args. I thought those with defaults would be automatically populated, but I guess that only happens via the CLI.

samarth8392 commented 3 months ago

Thanks Kelly for fixing those errors, now it's related to SIF cache dir.

Traceback (most recent call last):
  File "/vf/users/CCBR/projects/techDev/RENEE/main.py", line 16, in <module>
    sys.exit(main())
             ^^^^^^
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/__main__.py", line 1477, in main
    args.func(args)
  File "/vf/users/CCBR/projects/techDev/RENEE/src/renee/gui.py", line 180, in launch_gui
    singularity_cache=os.environ["SINGULARITY_CACHEDIR"],
                      ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen os>", line 679, in __getitem__
KeyError: 'SINGULARITY_CACHEDIR'
samarth8392 commented 3 months ago

Yeyy! The dryrun works now. I didn't submit a full run but I don't think there should be any issues there. If this is good enough, send me a 👍 and I will approve the PR and merge the branches.

kelly-sovacool commented 3 months ago

Yeyy! The dryrun works now. I didn't submit a full run but I don't think there should be any issues there. If this is good enough, send me a 👍 and I will approve the PR and merge the branches.

I think it is worthwhile to try submitting a full run just in case, since we don't have any unit tests for that section of the code

samarth8392 commented 3 months ago

I just submitted a full test run, all the rules worked file but jobby failed.

run_jobby_on_snakemake_log: command not found

I think this is a RENEE issue because the file is in resources folder so it should be os.path.join('resources', 'run_jobby_on_snakemake_log'

https://github.com/CCBR/RENEE/blob/2c930d7f28de079b70664568175c910c7f540d58/workflow/Snakefile#L232

also for spooker

run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short
/usr/bin/bash: run_jobby_on_snakemake_log: command not found
CalledProcessError in file /data/BandayLab/CABO-exome-seq-Andrea/mathurs2/ccbr1332/test/output/workflow/Snakefile, line 241:
Command 'set -euo pipefail;  run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short' returned non-zero exit status 127.
  File "/data/BandayLab/CABO-exome-seq-Andrea/mathurs2/ccbr1332/test/output/workflow/Snakefile", line 241, in __onsuccess
OnSuccess
run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short
kelly-sovacool commented 3 months ago

I just submitted a full test run, all the rules worked file but jobby failed.

run_jobby_on_snakemake_log: command not found

I think this is a RENEE issue because the file is in resources folder so it should be os.path.join('resources', 'run_jobby_on_snakemake_log'

run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short
/usr/bin/bash: run_jobby_on_snakemake_log: command not found
CalledProcessError in file /data/BandayLab/CABO-exome-seq-Andrea/mathurs2/ccbr1332/test/output/workflow/Snakefile, line 241:
Command 'set -euo pipefail;  run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short' returned non-zero exit status 127.
  File "/data/BandayLab/CABO-exome-seq-Andrea/mathurs2/ccbr1332/test/output/workflow/Snakefile", line 241, in __onsuccess
OnSuccess
run_jobby_on_snakemake_log logfiles/snakemake.log | tee logfiles/snakemake.log.jobby | cut -f2,3,18 > logfiles/snakemake.log.jobby.short

This is expected behavior if you did not run module load ccbrpipeliner before submitting the run.

We are no longer using the jobby file in the resources directory, instead we'd like to have "single source" of jobby (and spooker) which is added to the path when you load ccbrpipeliner.

samarth8392 commented 3 months ago

This is expected behavior if you did not run module load ccbrpipeliner before submitting the run.

We are no longer using the jobby file in the resources directory, instead we'd like to have "single source" of jobby (and spooker) which is added to the path when you load ccbrpipeliner.

Okay! Thanks!! In that case the test run was successful and I can now approve the PR. Great job @kelly-sovacool :)

kelly-sovacool commented 3 months ago

Go team!