eastgenomics / eggd_generate_variant_workbook

DNAnexus app for generating xlsx variant workbooks

Handle additional files #114

Closed: jethror1 closed this 2 years ago

jethror1 commented 2 years ago

Summary

Adds the ability to pass in additional non-VCF files, each of which is added as an extra sheet in the workbook. Compressed files are handled, and the file delimiter is inferred when reading, so comma-, space- and tab-delimited files (among others) are supported.
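As a rough illustration of the behaviour described above, the sketch below infers a delimiter from the first lines of a file that may be gzip-compressed. The function names, the candidate list and the "same count on every line" heuristic are assumptions made for this example and are not taken from the app's code.

```python
import gzip

# Candidate delimiters to try (assumed set, chosen to match the test names)
CANDIDATES = [",", ";", "\t", " "]


def read_first_lines(path, n=20):
    """Read up to n lines from a plain or gzip-compressed text file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.rstrip("\n") for _, line in zip(range(n), handle)]


def determine_delimiter(lines):
    """
    Pick the candidate that occurs the same number of times on every
    non-empty line, preferring the one that occurs most often.
    """
    best, best_count = None, 0
    for delim in CANDIDATES:
        counts = {line.count(delim) for line in lines if line.strip()}
        if len(counts) == 1:
            occurrences = counts.pop()
            if occurrences > best_count:
                best, best_count = delim, occurrences
    if best is None:
        raise ValueError("Could not determine delimiter")
    return best
```

A caller could then pass the result to a reader, for example `pandas.read_csv(path, sep=determine_delimiter(read_first_lines(path)))`, since pandas infers gzip compression from the `.gz` extension by default.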

Adds 2 new inputs to the DNAnexus app

New functions

Test jobs

Tests

New tests:

$ python3 -m pytest -v resources/home/dnanexus/generate_workbook/tests/test_utils.py 
===================================================== test session starts =====================================================
platform linux -- Python 3.8.10, pytest-7.0.1, pluggy-1.0.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/jethro/Projects/eggd_vcf2xls_nirvana/resources/home/dnanexus/generate_workbook/tests, configfile: pytest.ini
collected 6 items                                                                                                             

resources/home/dnanexus/generate_workbook/tests/test_utils.py::test_is_numeric PASSED                                   [ 16%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_comma PASSED                [ 33%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_semicolon PASSED            [ 50%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_tab PASSED                  [ 66%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_space PASSED                [ 83%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_mixed PASSED                [100%]

====================================================== 6 passed in 0.02s ======================================================

All tests:

$ python3 -m pytest -v resources/home/dnanexus/generate_workbook/tests/
===================================================== test session starts =====================================================
platform linux -- Python 3.8.10, pytest-7.0.1, pluggy-1.0.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/jethro/Projects/eggd_vcf2xls_nirvana/resources/home/dnanexus/generate_workbook/tests, configfile: pytest.ini
collected 33 items                                                                                                            

resources/home/dnanexus/generate_workbook/tests/test_columns.py::test PASSED                                            [  3%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_chrom PASSED                     [  6%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_pos PASSED                       [  9%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_id PASSED                        [ 12%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_ref PASSED                       [ 15%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_alt PASSED                       [ 18%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_qual PASSED                      [ 21%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestMainColumns::test_filter PASSED                    [ 24%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestInfoColumn::test_parsed_correct_columns_from_info_records PASSED [ 27%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestInfoColumn::test_parsed_correct_gnomAD_AF_values PASSED [ 30%]
resources/home/dnanexus/generate_workbook/tests/test_columns.py::TestFormatSample::test_format_sample_values_are_correct PASSED [ 33%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestModifyingFieldTypes::test_type_correctly_modified PASSED [ 36%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestModifyingFieldTypes::test_header_overwritten_correctly PASSED [ 39%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestFilters::test_filter_with_include_eq PASSED        [ 42%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestFilters::test_filter_with_exclude_eq PASSED        [ 45%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestFilters::test_filter_with_exclude_gt PASSED        [ 48%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestFilters::test_combined_exclude_float_and_string PASSED [ 51%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py::TestFilters::test_combined_filter_and_recover PASSED   [ 54%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::test_is_numeric PASSED                                   [ 57%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_comma PASSED                [ 60%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_semicolon PASSED            [ 63%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_tab PASSED                  [ 66%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_space PASSED                [ 69%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py::TestDetermineDelimeter::test_mixed PASSED                [ 72%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestHeader::test_column_names PASSED                       [ 75%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestHeader::test_parse_reference PASSED                    [ 78%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestHeader::test_only_header_parsed PASSED                 [ 81%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_drop_columns_exclude PASSED     [ 84%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_drop_columns_include PASSED     [ 87%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_reorder_columns_correct_order PASSED [ 90%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_reorder_columns_no_dropped_columns PASSED [ 93%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_non_rename_columns_unaffacted PASSED [ 96%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py::TestDataFrameActions::test_renamed_correctly PASSED        [100%]

====================================================== warnings summary =======================================================
resources/home/dnanexus/generate_workbook/tests/test_columns.py:164
  /home/jethro/Projects/eggd_vcf2xls_nirvana/resources/home/dnanexus/generate_workbook/tests/test_columns.py:164: DeprecationWarning: invalid escape sequence \_
    f"cut -f8 {self.test_vcf} | grep -oh "

resources/home/dnanexus/generate_workbook/tests/test_columns.py:186
  /home/jethro/Projects/eggd_vcf2xls_nirvana/resources/home/dnanexus/generate_workbook/tests/test_columns.py:186: DeprecationWarning: invalid escape sequence \.
    f"grep -v '^#' {self.test_vcf} | grep -oh "

resources/home/dnanexus/generate_workbook/utils/vcf.py:747
  /home/jethro/Projects/eggd_vcf2xls_nirvana/resources/home/dnanexus/generate_workbook/utils/vcf.py:747: DeprecationWarning: invalid escape sequence \ 
    f"Column(s) specified with --rename already present in "

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================== 33 passed, 3 warnings in 29.91s ===============================================
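The three DeprecationWarnings above come from backslashes such as `\_` and `\.` inside ordinary (f-)string literals. The offending patterns are not shown in full in the log, so the snippet below uses a hypothetical pattern to show the usual fix: marking the literal as raw (`r"..."` or `rf"..."`) so the backslash reaches the regex or shell command unchanged.

```python
import re

# Hypothetical field name and pattern, for illustration only
field = "gnomAD_AF"

# Raw string: the backslashes are kept literally, no escape-sequence warning
pattern = r"gnomAD_AF=[0-9\.e\-]+"

# Raw f-string: interpolation still works, backslashes are still literal
pattern_f = rf"{field}=[0-9\.e\-]+"

match = re.search(pattern_f, "DP=10;gnomAD_AF=0.0012;CSQ=x")
print(match.group(0))
```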


pep8speaks commented 2 years ago

Hello @jethror1! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 309:80: E501 line too long (89 > 79 characters)

Line 9:1: E402 module level import not at top of file
Line 45:80: E501 line too long (84 > 79 characters)
Line 53:80: E501 line too long (81 > 79 characters)
Line 71:80: E501 line too long (80 > 79 characters)
Line 75:80: E501 line too long (82 > 79 characters)
Line 77:80: E501 line too long (80 > 79 characters)
Line 81:80: E501 line too long (82 > 79 characters)
Line 83:80: W292 no newline at end of file

Line 478:5: E303 too many blank lines (2)

Line 309:5: E303 too many blank lines (2)

Comment last updated at 2022-07-20 16:50:57 UTC