lemieuxl / pyGenClean

Automated genetic data clean up procedure in Python.
GNU General Public License v3.0

TypeError in remove_heterozygous_haploid, v1.8.1 #25

Closed jgrundstad closed 7 years ago

jgrundstad commented 7 years ago

It looks as if nb_hh_missing stays None, rather than being set to 0, when the Plink log reports no heterozygous haploid genotypes set to missing. This leads to an unhandled TypeError in the .format() statement of the LaTeX summary that follows, killing the process.
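The failure can be reproduced outside pyGenClean with a few lines. This is only a minimal sketch: the log text is made up, and it assumes the Plink log simply lacks the "set to missing" line when nothing was changed.

```python
import re

# Made-up log content with no "heterozygous haploid" line (an assumption
# about what Plink writes when no genotype is set to missing).
log_text = "Writing pedigree information to [ subset.fam ]"

nb_hh_missing = None
match = re.search(
    r"(\d+) heterozygous haploid genotypes; set to missing",
    log_text,
)
if match:
    nb_hh_missing = int(match.group(1))

# With no match, nb_hh_missing is still None, so the subtraction fails.
try:
    plural = "s" if nb_hh_missing - 1 > 1 else ""
    error = None
except TypeError as exc:
    error = exc
```

Here error holds the same kind of TypeError as in the traceback, because None does not support subtraction.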

Log output

[2017-03-07 15:36:29 remove_heterozygous_haploid INFO] Options used:
[2017-03-07 15:36:29 remove_heterozygous_haploid INFO]   --bfile data_clean_up.phase2.ini/9_subset/subset
[2017-03-07 15:36:29 remove_heterozygous_haploid INFO]   --out data_clean_up.phase2.ini/10_remove_heterozygous_haploid/without_hh_genotypes
[2017-03-07 15:36:29 remove_heterozygous_haploid INFO] Running Plink to set heterozygous haploid as missing
Traceback (most recent call last):
  File "/home/grundaj/envs/pyGenClean/bin/run_pyGenClean", line 11, in <module>
    sys.exit(safe_main())
  File "/home/grundaj/envs/pyGenClean/lib/python2.7/site-packages/pyGenClean/run_data_clean_up.py", line 3585, in safe_main
    main()
  File "/home/grundaj/envs/pyGenClean/lib/python2.7/site-packages/pyGenClean/run_data_clean_up.py", line 202, in main
    options=options,
  File "/home/grundaj/envs/pyGenClean/lib/python2.7/site-packages/pyGenClean/run_data_clean_up.py", line 1880, in run_remove_heterozygous_haploid
    "s" if nb_hh_missing - 1 > 1 else "",
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

Code around line 1880

    # We get the number of genotypes that were set to missing
    nb_hh_missing = None
    with open(script_prefix + ".log", "r") as i_file:
        nb_hh_missing = re.search(
            r"(\d+) heterozygous haploid genotypes; set to missing",
            i_file.read(),
        )
    if nb_hh_missing:
        nb_hh_missing = int(nb_hh_missing.group(1))

    # We write a LaTeX summary
    latex_file = os.path.join(script_prefix + ".summary.tex")
    try:
        with open(latex_file, "w") as o_file:

            print >>o_file, latex_template.subsection(
                remove_heterozygous_haploid.pretty_name
            )
            text = (
                "After Plink's heterozygous haploid analysis, a total of "
                "{:,d} genotype{} were set to missing.".format(
                    nb_hh_missing,
                    "s" if nb_hh_missing - 1 > 1 else "",   #  <-- source of exception
                )
            )
            print >>o_file, latex_template.wrap_lines(text)

    except IOError:
        msg = "{}: cannot write LaTeX summary".format(latex_file)
        raise ProgramError(msg)

Thanks! Jason

lemieuxl commented 7 years ago

Thank you for noticing this. I'll work on a fix right away.

lemieuxl commented 7 years ago

Has been fixed in version 1.8.3.
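For readers on an older version, one way to guard against the missing match is sketched below. This is not necessarily the change made in 1.8.3: the helper names count_hh_missing and summary_sentence are hypothetical, and the plural test is written as != 1 instead of the original expression.

```python
import re

# Same pattern as in the snippet quoted in the report.
HH_PATTERN = re.compile(
    r"(\d+) heterozygous haploid genotypes; set to missing"
)


def count_hh_missing(log_text):
    """Return the count found in the log text, or 0 if the line is absent.

    Hypothetical helper, not part of pyGenClean.
    """
    match = HH_PATTERN.search(log_text)
    return int(match.group(1)) if match else 0


def summary_sentence(nb_hh_missing):
    """Build the summary sentence; nb_hh_missing is always an int here."""
    return (
        "After Plink's heterozygous haploid analysis, a total of "
        "{:,d} genotype{} were set to missing.".format(
            nb_hh_missing,
            "s" if nb_hh_missing != 1 else "",
        )
    )
```

Defaulting to 0 keeps the value an integer, so both the {:,d} format specifier and the plural test work even when the log contains no matching line.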

jgrundstad commented 7 years ago

Nice, thanks!

On Thu, Mar 9, 2017 at 9:24 AM, Louis-Philippe Lemieux Perreault < notifications@github.com> wrote:

Has been fixed in version 1.8.3.
