code improvements and bug fixes

abearab commented 4 months ago

update docs details and multiple bug fixes in phenotype calculation
improve module organizations and rename plotting sub-modules to plotting / pl

abearab commented 4 months ago

pytest filed here https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81.

@tshauck – I think this might be caused by changing from biobear 0.20.3 -> 0.22.*. Any thoughts?

Run pytest -s
============================= test session starts ==============================
platform linux -- Python 3.9.19, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/runner/work/ScreenPro2/ScreenPro2
collected 4 items

tests/test_fastq2count.py FF.
tests/test_phenoscore.py target    targetID_0  targetID_1  ...  targetID_8  targetID_9
sample                            ...                        
sample_0          23          18  ...          28          [10](https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81#step:8:11)
sample_1          29           5  ...          12          24
sample_2           3          13  ...          20          12
sample_3          82          92  ...          90          69
sample_4          73          31  ...          57          78
sample_5          50          56  ...          75          24

[6 rows x 10 columns]
    B vs A
.

=================================== FAILURES ===================================
____________________________ test_cas9_single_guide ____________________________

    def test_cas9_single_guide():
>       df_count = ngs.cas9.fastq_to_count_single_guide(
            fastq_file_path='demo/step1_process_fastq_files/example_crispri_v2_sample.fastq.gz',
            trim5p_start=2,
            trim5p_length=19,
            verbose=True
        )

tests/test_fastq2count.py:6: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
screenpro/ngs/cas9.py:30: in fastq_to_count_single_guide
    df_count = session.sql(sql_cmd).to_polars()
pyarrow/table.pxi:4767: in pyarrow.lib.Table.from_batches
    ???
pyarrow/error.pxi:[15](https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81#step:8:16)4: in pyarrow.lib.pyarrow_internal_check_status
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

  /usr/share/miniconda3/envs/test/lib/python3.9/site-packages/pandas/core/algorithms.py:5[22](https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81#step:8:23): DeprecationWarning: np.find_common_type is deprecated.  Please use `np.result_type` or `np.promote_types`.
  See https://numpy.org/devdocs/release/1.[25](https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81#step:8:26).0-notes.html and the docs for more information.  (Deprecated NumPy 1.25)
    common = np.find_common_type([values.dtype, comps_array.dtype], [])

tests/test_phenoscore.py::test_runPhenoScore
  /home/runner/work/ScreenPro2/ScreenPro2/screenpro/phenoscore/delta.py:45: RuntimeWarning: divide by zero encountered in log2
    return np.mean(np.log2(y) - np.log2(x), axis=1)

tests/test_phenoscore.py::test_runPhenoScore
  /usr/share/miniconda3/envs/test/lib/python3.9/site-packages/scipy/stats/_axis_nan_policy.py:551: RuntimeWarning: Precision loss occurred in moment calculation due to catastrophic cancellation. This occurs when the data are nearly identical. Results may be unreliable.
    res = hypotest_fun_out(*samples, axis=axis, **kwds)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_fastq2count.py::test_cas9_single_guide - pyarrow.lib.ArrowInvalid: Schema at index 0 was different: 
protospacer: string
count: int64 not null
vs
protospacer: string
count: int64
FAILED tests/test_fastq2count.py::test_cas9_dual_guide - pyarrow.lib.ArrowInvalid: Schema at index 0 was different: 
protospacer_a: string
protospacer_b: string
count: int64 not null
vs
protospacer_a: string
protospacer_b: string
count: int64
=================== 2 failed, 2 passed, 5 warnings in 3.[29](https://github.com/ArcInstitute/ScreenPro2/actions/runs/10094083301/job/27911166033?pr=81#step:8:30)s ====================
Error: Process completed with exit code 1.

tshauck commented 4 months ago

@abearab Thanks for mentioning this. I'll have a look and see if anything's changed w.r.t. count nullability and follow up.

abearab commented 4 months ago

@abearab Thanks for mentioning this. I'll have a look and see if anything's changed w.r.t. count nullability and follow up.

Sounds wonderful, thanks!

tshauck commented 3 months ago

Sure thing! I think I've identified the issue and pushed a fix. It's working its way to pypi now, and I'll follow up here once I've confirmed things look good.

tshauck commented 3 months ago

@abearab I think this should be fixed on 0.22.7, though please lmk if you have another other issues. Thanks again for the ping!

abearab commented 3 months ago

@abearab I think this should be fixed on 0.22.7, though please lmk if you have another other issues. Thanks again for the ping!

Hi @tshauck, thanks for looking into it! I'll give it a try in the next PR – #84

abearab commented 3 months ago

@tshauck, FYI – it's now working with the biobear 0.22.7

https://github.com/ArcInstitute/ScreenPro2/actions/runs/10131611251/job/28014544810

tshauck commented 3 months ago

@abearab Nice, thanks for letting me know!

ArcInstitute / ScreenPro2

code improvements and bug fixes #81