HuttleyLab / DiverseSeq

Tools for analysis of sequence divergence
BSD 3-Clause "New" or "Revised" License

Linted paper scripts and notebooks #81

Closed: GavinHuttley closed this pull request 1 week ago

GavinHuttley commented 1 week ago

Summary by Sourcery

Lint and refactor paper scripts and notebooks to enhance code quality and maintainability, and add comprehensive documentation for the 'diverse-seq' application.

sourcery-ai[bot] commented 1 week ago

Reviewer's Guide by Sourcery

This PR adds Python scripts and Jupyter notebooks for generating figures and analyzing results for the paper. The changes include scripts for benchmarking performance, evaluating clustering algorithms, and generating visualizations of the results.

Class diagram for new classes in jsd_v_dist.py

classDiagram
    class min_dist {
        -dists
        -num
        +__init__(dists)
        +__call__(names: list[str]) -> float
    }

    class compare_sets {
        -dist_size: int
        -dvgt
        +__init__(app, dist_size: int)
        +main(aln: c3_types.AlignedSeqsType) -> dict
    }

    class make_sample {
        -pool_sizes: dict
        -seq_len: int
        +main(num: int) -> c3_types.UnalignedSeqsType
    }

    class seqcoll_to_records {
        -k: int
        +main(seqs: c3_types.UnalignedSeqsType) -> list[KmerSeq]
    }

    class true_positive {
        -expected: set[str]
        -label2pool: callable
        -size: int
        -stat: str
        +main(records: list[KmerSeq]) -> bool
    }
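To make the first diagram concrete, here is a minimal sketch of how a callable such as `min_dist` could behave, assuming `dists` is a pairwise distance lookup keyed by unordered name pairs. The keying scheme and the body are assumptions for illustration, not the repository's implementation.

```python
from itertools import combinations


class min_dist:
    """Sketch: smallest pairwise distance among a set of sequence names.

    Assumes ``dists`` maps frozenset({name_a, name_b}) -> float; this is a
    hypothetical reading of the interface shown in the class diagram.
    """

    def __init__(self, dists: dict) -> None:
        self.dists = dists
        self.num = len(dists)

    def __call__(self, names: list[str]) -> float:
        # minimum over all unordered pairs drawn from the provided names
        return min(self.dists[frozenset(pair)] for pair in combinations(names, 2))
```

For example, with three names the minimum is taken over the three possible pairs.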

Class diagram for new classes in synthetic_known.py

classDiagram
    class make_sample {
        -pool_sizes: dict
        -seq_len: int
        +main(num: int) -> c3_types.UnalignedSeqsType
    }

    class seqcoll_to_records {
        -k: int
        +main(seqs: c3_types.UnalignedSeqsType) -> list[KmerSeq]
    }

    class true_positive {
        -expected: set[str]
        -label2pool: callable
        -size: int
        -stat: str
        +main(records: list[KmerSeq]) -> bool
    }

    class eval_condition {
        -num_reps
        -k
        -repeats
        -pools
        +main(settings: tuple[str, str, int]) -> c3_types.TabularType
    }

File-Level Changes

Added scripts for benchmarking and performance evaluation
  • Added benchmark.py for measuring execution time of different commands
  • Added benchmark_ctree.py specifically for evaluating clustering tree performance
  • Implemented TimeIt and TempWorkingDir utility classes for benchmarking
Files: paper/nbks/benchmark.py, paper/nbks/benchmark_ctree.py
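The `TimeIt` and `TempWorkingDir` utilities are only named in the summary; a plausible minimal sketch follows, assuming `TimeIt` is a wall-clock timing context manager and `TempWorkingDir` swaps the process working directory for a throwaway one. Both bodies are guesses at intent, not the repository's code.

```python
import os
import tempfile
import time


class TimeIt:
    """Sketch: record wall-clock time for a with-block."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress exceptions


class TempWorkingDir:
    """Sketch: run a block inside a temporary working directory."""

    def __enter__(self):
        self._old = os.getcwd()
        self._tmp = tempfile.TemporaryDirectory()
        os.chdir(self._tmp.name)
        return self._tmp.name

    def __exit__(self, *exc):
        os.chdir(self._old)
        self._tmp.cleanup()
        return False
```

Pairing the two keeps benchmark runs isolated: each timed command executes in a clean directory that is deleted on exit.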
Added scripts and notebooks for analyzing and visualizing results
  • Added Jupyter notebooks for analyzing JSD vs distance metrics
  • Added notebook for demonstrating plugin usage
  • Added notebook for analyzing benchmark results
  • Added notebook for analyzing clustering tree results
Files: paper/nbks/jsd_v_dist.ipynb, paper/nbks/plugin_demo.ipynb, paper/nbks/benchmark.ipynb, paper/nbks/ctree.ipynb
Added supporting infrastructure for experiments
  • Added script for downloading required datasets
  • Added project path utilities for managing file locations
  • Added LaTeX files for generating algorithm figures
  • Created directory structure for results and figures
Files: paper/nbks/get_data_sets.py, paper/nbks/project_path.py, paper/figs/max.tex, paper/figs/nmost.tex
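The contents of `project_path.py` are not shown in the guide; a minimal sketch of what a project-path utility typically provides is below. All names here (`ROOT`, `make_path`) are assumptions for illustration, not the repository's API.

```python
from pathlib import Path

# Hypothetical project-path helper: anchor all outputs under one root so
# scripts and notebooks agree on result and figure locations.
ROOT = Path.cwd()


def make_path(*parts: str) -> Path:
    """Return ROOT joined with parts, creating parent directories as needed."""
    p = ROOT.joinpath(*parts)
    p.parent.mkdir(parents=True, exist_ok=True)
    return p
```

Centralising path construction like this avoids each notebook hard-coding relative paths that break when run from a different directory.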
Added scripts for clustering tree experiments
  • Added experiment.py for running clustering tree analysis
  • Added iq_experiment.py for IQ-TREE comparisons
  • Added likelihoods.py for computing tree likelihoods
Files: paper/nbks/ctree/experiment.py, paper/nbks/ctree/iq_experiment.py, paper/nbks/ctree/likelihoods.py

coveralls commented 1 week ago

Pull Request Test Coverage Report for Build 11785936111

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Totals Coverage Status
Change from base Build 11770223146: 0.0%
Covered Lines: 1190
Relevant Lines: 1295
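The totals above imply the overall line coverage; a quick check, using only the numbers reported in the table:

```python
covered, relevant = 1190, 1295  # totals from the report above
coverage = 100 * covered / relevant
print(f"{coverage:.2f}%")  # prints 91.89%
```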

💛 - Coveralls