KChen-lab / Monopogen

SNV calling from single cell sequencing
GNU General Public License v3.0

Incorrect version of beagle when using conda install monopogen #69

Open hima-anbunathan-takara opened 4 months ago

hima-anbunathan-takara commented 4 months ago

Hi,

When installing with conda, this version of beagle is installed: beagle.27May24.118.jar. However, the source code (germline.py) looks for two other versions of beagle, beagle.08Feb22.fa4.jar and beagle.27Jul16.86a.jar. Can this be changed so that conda installs the version of beagle that monopogen uses?

Thanks, Hima

jinzhuangdou commented 4 months ago

Does it work to install by specifying the beagle version?

conda install -c bioconda beagle=27Jul16.86a

If it does not work, you can manually add the JAR file beagle.27Jul16.86a.jar to your conda environment.

hima-anbunathan-takara commented 4 months ago

It looks like this version of beagle is not available through conda install:

PackagesNotFoundError: The following packages are not available from current channels:

MikeDMorgan commented 3 months ago

FYI - if one of the beagle versions is missing germline.py will error - currently only one (beagle.27Jul16.86a.jar) is included on the main branch.

yu-tong-wang commented 3 months ago

FYI - if one of the beagle versions is missing germline.py will error - currently only one (beagle.27Jul16.86a.jar) is included on the main branch.

Yes. When running Monopogen on Linux, I encountered an error related to the Beagle dependency. The error message indicates that the script is looking for beagle.08Feb22.fa4.jar, but this file is not present in the apps folder. Instead, beagle.27Jul16.86a.jar is available.

The script fails with the following error: AssertionError: Program beagle.08Feb22.fa4.jar cannot be found!

Relevant Code: In src/germline.py, the check_dependencies function includes:

def check_dependencies(args):
    programs_to_check = ("vcftools", "bgzip",  "bcftools", "beagle.08Feb22.fa4.jar", "beagle.27Jul16.86a.jar","samtools","picard.jar", "java")
    for prog in programs_to_check:
        out = os.popen("command -v {}".format(args.app_path + "/" + prog)).read()
        assert out != "", "Program {} cannot be found!".format(prog)
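
To see which entries would trip this assertion without running Monopogen, the same check can be sketched as a standalone helper. This is only an illustration under stated assumptions, not the project's code: `find_missing` is a hypothetical name, and it tests for plain file existence under the apps folder instead of shelling out to `command -v`, so its result may differ slightly from the original check. It reports every missing entry rather than stopping at the first one.

```python
import os

# Names copied from the programs_to_check tuple quoted above.
PROGRAMS_TO_CHECK = (
    "vcftools", "bgzip", "bcftools",
    "beagle.08Feb22.fa4.jar", "beagle.27Jul16.86a.jar",
    "samtools", "picard.jar", "java",
)


def find_missing(app_path, programs=PROGRAMS_TO_CHECK):
    """Return the entries of `programs` that are not files under `app_path`.

    Hypothetical helper: it checks file existence only, which is an
    assumption and not what germline.py itself does.
    """
    return [
        prog for prog in programs
        if not os.path.isfile(os.path.join(app_path, prog))
    ]
```

Calling `find_missing` with the path that is passed to Monopogen as the app path returns a list of the names that are absent from that folder; an empty list means every listed file is present.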

swvanderlaan commented 3 months ago

I wrote this:

import os
import shutil

def check_dependencies(args):
    # NEW code -- 2024-08-15
    # these programs are installed via conda/mamba
    programs_to_check = ("vcftools", "bgzip", "bcftools", "samtools", "java")

    for prog in programs_to_check:
        # NEW code -- 2024-08-15
        # shutil.which returns None when prog is not on PATH, so the assert
        # below fires with a readable message (a failing `which` subprocess
        # would raise CalledProcessError before the assert is reached)
        location_prog = shutil.which(prog)
        if args.debug:
            print(f"DEBUGGING: location_prog = {location_prog}")
        assert location_prog is not None, "Program {} cannot be found!".format(prog)
    # NEW code -- 2024-08-15
    # these programs are downloaded via the Monopogen github repository
    jars_to_check = ("beagle.27Jul16.86a.jar", "picard.jar")
    for jar in jars_to_check:
        jar_path = os.path.join(args.app_path, jar)
        if args.debug:
            print(f"DEBUGGING: Checking JAR file at {jar_path}")
        assert os.path.isfile(jar_path), "Java jar file {} cannot be found at path {}!".format(jar, jar_path)

However, a few things: