HuttleyLab / DiverseSeq

Tools for analysis of sequence divergence
BSD 3-Clause "New" or "Revised" License

Linted paper scripts and notebooks #81

Closed: GavinHuttley closed this pull request 1 week ago

GavinHuttley commented 1 week ago

Summary by Sourcery

Lint and refactor paper scripts and notebooks to enhance code quality and maintainability, and add comprehensive documentation for the 'diverse-seq' application.

sourcery-ai[bot] commented 1 week ago

Reviewer's Guide by Sourcery

This PR adds Python scripts and Jupyter notebooks for generating figures and analyzing results for the paper. The changes include scripts for benchmarking performance, evaluating clustering algorithms, and generating visualizations of the results.

Class diagram for new classes in jsd_v_dist.py

classDiagram
    class min_dist {
        -dists
        -num
        +__init__(dists)
        +__call__(names: list[str]) -> float
    }

    class compare_sets {
        -dist_size: int
        -dvgt
        +__init__(app, dist_size: int)
        +main(aln: c3_types.AlignedSeqsType) -> dict
    }

    class make_sample {
        -pool_sizes: dict
        -seq_len: int
        +main(num: int) -> c3_types.UnalignedSeqsType
    }

    class seqcoll_to_records {
        -k: int
        +main(seqs: c3_types.UnalignedSeqsType) -> list[KmerSeq]
    }

    class true_positive {
        -expected: set[str]
        -label2pool: callable
        -size: int
        -stat: str
        +main(records: list[KmerSeq]) -> bool
    }
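To make the first diagram concrete, here is a minimal sketch of how a callable such as `min_dist` could behave, assuming `dists` is a pairwise distance lookup keyed by unordered name pairs. The keying scheme and the body are assumptions for illustration, not the repository's implementation.

```python
from itertools import combinations


class min_dist:
    """Sketch: smallest pairwise distance among a set of sequence names.

    Assumes ``dists`` maps frozenset({name_a, name_b}) -> float; this is a
    hypothetical reading of the interface shown in the class diagram.
    """

    def __init__(self, dists: dict) -> None:
        self.dists = dists
        self.num = len(dists)

    def __call__(self, names: list[str]) -> float:
        # minimum over all unordered pairs drawn from the provided names
        return min(self.dists[frozenset(pair)] for pair in combinations(names, 2))
```

For example, with three names the minimum is taken over the three possible pairs.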

Class diagram for new classes in synthetic_known.py

classDiagram
    class make_sample {
        -pool_sizes: dict
        -seq_len: int
        +main(num: int) -> c3_types.UnalignedSeqsType
    }

    class seqcoll_to_records {
        -k: int
        +main(seqs: c3_types.UnalignedSeqsType) -> list[KmerSeq]
    }

    class true_positive {
        -expected: set[str]
        -label2pool: callable
        -size: int
        -stat: str
        +main(records: list[KmerSeq]) -> bool
    }

    class eval_condition {
        -num_reps
        -k
        -repeats
        -pools
        +main(settings: tuple[str, str, int]) -> c3_types.TabularType
    }

File-Level Changes

Added scripts for benchmarking and performance evaluation
  • Added benchmark.py for measuring execution time of different commands
  • Added benchmark_ctree.py specifically for evaluating clustering tree performance
  • Implemented TimeIt and TempWorkingDir utility classes for benchmarking
Files: paper/nbks/benchmark.py, paper/nbks/benchmark_ctree.py
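The `TimeIt` and `TempWorkingDir` utilities are only named in the summary; a plausible minimal sketch follows, assuming `TimeIt` is a wall-clock timing context manager and `TempWorkingDir` swaps the process working directory for a throwaway one. Both bodies are guesses at intent, not the repository's code.

```python
import os
import tempfile
import time


class TimeIt:
    """Sketch: record wall-clock time for a with-block."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress exceptions


class TempWorkingDir:
    """Sketch: run a block inside a temporary working directory."""

    def __enter__(self):
        self._old = os.getcwd()
        self._tmp = tempfile.TemporaryDirectory()
        os.chdir(self._tmp.name)
        return self._tmp.name

    def __exit__(self, *exc):
        os.chdir(self._old)
        self._tmp.cleanup()
        return False
```

Pairing the two keeps benchmark runs isolated: each timed command executes in a clean directory that is deleted on exit.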
Added scripts and notebooks for analyzing and visualizing results
  • Added Jupyter notebooks for analyzing JSD vs distance metrics
  • Added notebook for demonstrating plugin usage
  • Added notebook for analyzing benchmark results
  • Added notebook for analyzing clustering tree results
Files: paper/nbks/jsd_v_dist.ipynb, paper/nbks/plugin_demo.ipynb, paper/nbks/benchmark.ipynb, paper/nbks/ctree.ipynb
Added supporting infrastructure for experiments
  • Added script for downloading required datasets
  • Added project path utilities for managing file locations
  • Added LaTeX files for generating algorithm figures
  • Created directory structure for results and figures
Files: paper/nbks/get_data_sets.py, paper/nbks/project_path.py, paper/figs/max.tex, paper/figs/nmost.tex
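The contents of `project_path.py` are not shown in the guide; a minimal sketch of what a project-path utility typically provides is below. All names here (`ROOT`, `make_path`) are assumptions for illustration, not the repository's API.

```python
from pathlib import Path

# Hypothetical project-path helper: anchor all outputs under one root so
# scripts and notebooks agree on result and figure locations.
ROOT = Path.cwd()


def make_path(*parts: str) -> Path:
    """Return ROOT joined with parts, creating parent directories as needed."""
    p = ROOT.joinpath(*parts)
    p.parent.mkdir(parents=True, exist_ok=True)
    return p
```

Centralising path construction like this avoids each notebook hard-coding relative paths that break when run from a different directory.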
Added scripts for clustering tree experiments
  • Added experiment.py for running clustering tree analysis
  • Added iq_experiment.py for IQ-TREE comparisons
  • Added likelihoods.py for computing tree likelihoods
Files: paper/nbks/ctree/experiment.py, paper/nbks/ctree/iq_experiment.py, paper/nbks/ctree/likelihoods.py

coveralls commented 1 week ago

Pull Request Test Coverage Report for Build 11785936111

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Totals Coverage Status
Change from base Build 11770223146: 0.0%
Covered Lines: 1190
Relevant Lines: 1295
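The totals above imply the overall line coverage; a quick check, using only the numbers reported in the table:

```python
covered, relevant = 1190, 1295  # totals from the report above
coverage = 100 * covered / relevant
print(f"{coverage:.2f}%")  # prints 91.89%
```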

💛 - Coveralls