cpouchon / REFMAKER

Make your own nuclear references from genomic assemblies of shotgun libraries.

mc.run_mcl error #4

Open jmhallas opened 11 months ago

jmhallas commented 11 months ago

Hello,

My run of REFMAKER seems to die while creating the clean_catalog.fa file. The following is the error message:

Traceback (most recent call last):
  File "/share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-main/src/FiltMeta.py", line 313, in <module>
    result = mc.run_mcl(matrix, inflation=bestV)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 189, in run_mcl
    assert inflation > 1, "Invalid inflation parameter"
AssertionError: Invalid inflation parameter
[E::fai_build3] Failed to open the FASTA file /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/catalog/clean_catalog.fa
Could not build fai index /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/catalog/clean_catalog.fa.fai
ls: cannot access '/share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/mapping/*rescaled.bam': No such file or directory
ls: cannot access '/share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/calling/*.bcf': No such file or directory
ls: cannot access '/share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/calling/filtered/*.bcf': No such file or directory

About:   Merge multiple VCF/BCF files from non-overlapping sample sets to create one multi-sample file.
         Note that only records from different files can be merged, never from the same file. For
         "vertical" merge take a look at "bcftools norm" instead.
Usage:   bcftools merge [options] <A.vcf.gz> <B.vcf.gz> [...]

Options:
        --force-samples                resolve duplicate sample names
        --print-header                 print only the merged header and exit
        --use-header <file>            use the provided header
    -0  --missing-to-ref               assume genotypes at missing sites are 0/0
    -f, --apply-filters <list>         require at least one of the listed FILTER strings (e.g. "PASS,.")
    -F, --filter-logic <x|+>           remove filters if some input is PASS ("x"), or apply all filters ("+") [+]
    -g, --gvcf <-|ref.fa>              merge gVCF blocks, INFO/END tag is expected. Implies -i QS:sum,MinDP:min,I16:sum,IDV:max,IMF:max
    -i, --info-rules <tag:method,..>   rules for merging INFO fields (method is one of sum,avg,min,max,join) or "-" to turn off the default [DP:sum,DP4:sum]
    -l, --file-list <file>             read file names from the file
    -m, --merge <string>               allow multiallelic records for <snps|indels|both|all|none|id>, see man page for details [both]
        --no-version                   do not append version and command line to the header
    -o, --output <file>                write output to a file [standard output]
    -O, --output-type <b|u|z|v>        'b' compressed BCF; 'u' uncompressed BCF; 'z' compressed VCF; 'v' uncompressed VCF [v]
    -r, --regions <region>             restrict to comma-separated list of regions
    -R, --regions-file <file>          restrict to regions listed in a file
        --threads <int>                number of extra output compression threads [0]

mkdir: cannot create directory '/share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/consense': File exists
Traceback (most recent call last):
  File "/share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-main/src/consfilter1.py", line 474, in <module>
    stats_ind[s]["f.missing"]=float(stats_ind[s]["c.missing"])/float(tot_loci)
ZeroDivisionError: float division by zero

I have looked at the FiltMeta.py script, and the traceback points me to line 313:

result = mc.run_mcl(matrix, inflation=bestV)

In my catalog directory I have the following files:

all_clean_metassemblies.fa  blast_refall_refall_cleaned.out
all_clean_metassemblies.fa.ndb  blast_unclean_k31.out
all_clean_metassemblies.fa.nhr  blast_unclean_k51.out
all_clean_metassemblies.fa.nin  blast_unclean_k71.out
all_clean_metassemblies.fa.njs  blast_unclean_k91.out
all_clean_metassemblies.fa.not  metacontigs_cpdna.infos
all_clean_metassemblies.fa.nsq  metacontigs_mtdna.infos
all_clean_metassemblies.fa.ntf  metacontigs_others.infos
all_clean_metassemblies.fa.nto  metacontigs_rdna.infos

Any suggestions on how to correct the error and generate the clean_catalog.fa file? I'm really interested in getting your pipeline to work; it is perfect for my low-coverage data.

Thanks,

-Josh

cpouchon commented 11 months ago

Hi Josh, thank you for your comment. Can you send me your FASTA file and your BLAST output? If you want, you can send them to: contact@orthoskim.org. I think I have to change the inflation values. Thanks, Cheers
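For context, the assertion fires because markov_clustering's run_mcl requires inflation > 1. A guard over the automatically chosen value would avoid the crash; this is a hypothetical sketch, not REFMAKER's actual patch, and the fallback value 2.0 is an arbitrary assumption:

```python
def safe_inflation(best_v, fallback=2.0):
    """mc.run_mcl asserts inflation > 1 before clustering; if the
    automatically selected value (bestV) degenerates to <= 1, substitute
    a small valid default instead of crashing. Hypothetical guard."""
    return best_v if best_v > 1 else fallback

# Usage sketch against the failing call in FiltMeta.py:
# result = mc.run_mcl(matrix, inflation=safe_inflation(bestV))
```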

cpouchon commented 11 months ago

Hi Josh, I have pushed a new version of the src/FiltMeta.py script. Can you download it and move it into your src/ folder? I think it's OK now; you can rerun the catalog filtering mode. Let me know, Cheers Charles

jmhallas commented 10 months ago

Hi Charles,

The updated FiltMeta.py script worked perfectly. However, I encountered a new error during the mapping step. It seems to be an issue finding the reference used to create the sequence dictionary: the clean_catalog.fa file was indexed with bwa, but the Picard dictionary step errored out. These are the files associated with clean_catalog.fa:

clean_catalog.fa clean_catalog.fa.amb clean_catalog.fa.ann clean_catalog.fa.bwt clean_catalog.fa.fai clean_catalog.fa.pac clean_catalog.fa.sa

Here is the error:

[bwa_index] Pack FASTA... 0.07 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 1.98 seconds elapse.
[bwa_index] Update BWT... 0.06 sec
[bwa_index] Pack forward-only FASTA... 0.05 sec
[bwa_index] Construct SA from BWT and Occ... 0.61 sec
[main] Version: 0.7.17-r1188
[main] CMD: /share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/bwa index /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/catalog/clean_catalog.fa
[main] Real time: 8.017 sec; CPU: 2.776 sec

ERROR: Invalid argument '-R'.

USAGE: CreateSequenceDictionary [options]

Documentation: http://broadinstitute.github.io/picard/command-line-overview.html#CreateSequenceDictionary

Creates a sequence dictionary for a reference sequence. This tool creates a sequence dictionary file (with ".dict" extension) from a reference sequence provided in FASTA format, which is required by many processing and analysis tools. The output file contains a header but no SAMRecords, and the header contains only sequence records.

The reference sequence can be gzipped (both .fasta and .fasta.gz are supported). Usage example:

java -jar picard.jar CreateSequenceDictionary \
    R=reference.fasta \
    O=reference.dict

Version: 2.18.29-SNAPSHOT

Options:

--help -h Displays options specific to this tool.

--stdhelp -H Displays options specific to this tool AND options common to all Picard command line tools.

--version Displays program version.

OUTPUT=File O=File Output SAM file containing only the sequence dictionary. By default it will use the base name of the input reference with the .dict extension Default value: null.

GENOME_ASSEMBLY=String AS=String Put into AS field of sequence dictionary entry if supplied Default value: null.

URI=String UR=String Put into UR field of sequence dictionary entry. If not supplied, input reference file is used Default value: null.

SPECIES=String SP=String Put into SP field of sequence dictionary entry Default value: null.

TRUNCATE_NAMES_AT_WHITESPACE=Boolean Make sequence name the first word from the > line in the fasta file. By default the entire contents of the > line is used, excluding leading and trailing whitespace. Default value: true. This option can be set to 'null' to clear the default value. Possible values: {true, false}

NUM_SEQUENCES=Integer Stop after writing this many sequences. For testing. Default value: 2147483647. This option can be set to 'null' to clear the default value.

ALT_NAMES=File AN=File Optional file containing the alternative names for the contigs. Tools may use this information to consider different contig notations as identical (e.g: 'chr1' and '1'). The alternative names will be put into the appropriate @AN annotation for each contig. No header. First column is the original name, the second column is an alternative name. One contig may have more than one alternative name. Default value: null.

REFERENCE=File R=File Input reference fasta or fasta.gz Required.


cpouchon commented 10 months ago

Hi Josh, this is related to the version of picard that is installed within your environment. Can you try to activate your refmaker environment and then install this:

conda install "picard>=2.27"

Cheers Charles
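The underlying incompatibility: Picard changed its command-line parser, so old releases (like the 2.18.29-SNAPSHOT in the usage dump above) accept only the legacy KEY=value syntax and reject the POSIX-style -R flag. A sketch of building a version-appropriate invocation; the 2.27 cutoff is taken from the version pin suggested above, not from Picard's changelog, so treat it as an assumption:

```python
def create_dict_cmd(picard_jar, fasta, picard_version):
    """Build a CreateSequenceDictionary invocation matching the installed
    Picard. Newer releases accept POSIX-style flags (-R/-O); old ones only
    the legacy KEY=value syntax. The 2.27 boundary is an assumption based
    on the version pin recommended in this thread."""
    major, minor = (int(x) for x in picard_version.split(".")[:2])
    out = fasta.rsplit(".", 1)[0] + ".dict"
    if (major, minor) >= (2, 27):
        args = ["-R", fasta, "-O", out]
    else:
        args = ["R=" + fasta, "O=" + out]
    return ["java", "-jar", picard_jar, "CreateSequenceDictionary"] + args
```

The returned list can be passed straight to subprocess.run.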

jmhallas commented 10 months ago

Hi Charles,

I updated Picard as you suggested and it worked, thank you. I was able to get Picard working but encountered another issue: after picard.sam.markduplicates.MarkDuplicates finishes, I get "unrecognized command 'coverage'". I am not familiar with this command. I double-checked that all my packages are up to date, and everything seems to be correct. Is 'coverage' a command in one of the supplied wrapper scripts?

[Mon Nov 13 14:46:29 PST 2023] picard.sam.AddOrReplaceReadGroups done. Elapsed time: 0.19 minutes. Runtime.totalMemory()=520617984
14:46:31.179 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/share/picard-2.27.5-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Nov 13 14:46:31 PST 2023] MarkDuplicates --INPUT /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/mapping/temp_1_sorted_keep_rg.bam --OUTPUT /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/mapping/temp_1_sorted_keep_pcrdup.bam --METRICS_FILE /share/cdfwwildlife/hallas_dedicated/lc_deer/refmarker/res_refmaker/mapping/marks --REMOVE_DUPLICATES true --MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP 50000 --MAX_FILE_HANDLES_FOR_READ_ENDS_MAP 8000 --SORTING_COLLECTION_SIZE_RATIO 0.25 --TAG_DUPLICATE_SET_MEMBERS false --REMOVE_SEQUENCING_DUPLICATES false --TAGGING_POLICY DontTag --CLEAR_DT true --DUPLEX_UMI false --FLOW_MODE false --FLOW_QUALITY_SUM_STRATEGY false --USE_END_IN_UNPAIRED_READS false --USE_UNPAIRED_CLIPPED_END false --UNPAIRED_END_UNCERTAINTY 0 --FLOW_SKIP_FIRST_N_FLOWS 0 --FLOW_Q_IS_KNOWN_END false --FLOW_EFFECTIVE_QUALITY_THRESHOLD 15 --ADD_PG_TAG_TO_READS true --ASSUME_SORTED false --DUPLICATE_SCORING_STRATEGY SUM_OF_BASE_QUALITIES --PROGRAM_RECORD_ID MarkDuplicates --PROGRAM_GROUP_NAME MarkDuplicates --READ_NAME_REGEX <optimized capture of last three ':' separated fields as numeric values> --OPTICAL_DUPLICATE_PIXEL_DISTANCE 100 --MAX_OPTICAL_DUPLICATE_SET_SIZE 300000 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Mon Nov 13 14:46:31 PST 2023] Executing as @.*** on Linux 4.15.0-142-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_332-b09; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.27.5
INFO 2023-11-13 14:46:31 MarkDuplicates Start of doWork freeMemory: 500061792; totalMemory: 514850816; maxMemory: 1908932608
INFO 2023-11-13 14:46:31 MarkDuplicates Reading input file and constructing read end information.
INFO 2023-11-13 14:46:31 MarkDuplicates Will retain up to 6916422 data points before spilling to disk.
INFO 2023-11-13 14:46:34 MarkDuplicates Read 293713 records. 30648 pairs never matched.
INFO 2023-11-13 14:46:34 MarkDuplicates After buildSortedReadEndLists freeMemory: 819705736; totalMemory: 919076864; maxMemory: 1908932608
INFO 2023-11-13 14:46:34 MarkDuplicates Will retain up to 59654144 duplicate indices before spilling to disk.
INFO 2023-11-13 14:46:34 MarkDuplicates Traversing read pair information and detecting duplicates.
INFO 2023-11-13 14:46:34 MarkDuplicates Traversing fragment information and detecting duplicates.
INFO 2023-11-13 14:46:34 MarkDuplicates Sorting list of duplicate records.
INFO 2023-11-13 14:46:35 MarkDuplicates After generateDuplicateIndexes freeMemory: 945782256; totalMemory: 1440219136; maxMemory: 1908932608
INFO 2023-11-13 14:46:35 MarkDuplicates Marking 53724 records as duplicates.
INFO 2023-11-13 14:46:35 MarkDuplicates Found 0 optical duplicate clusters.
INFO 2023-11-13 14:46:35 MarkDuplicates Reads are assumed to be ordered by: coordinate
INFO 2023-11-13 14:46:41 MarkDuplicates Writing complete. Closing input iterator.
INFO 2023-11-13 14:46:42 MarkDuplicates Duplicate Index cleanup.
INFO 2023-11-13 14:46:42 MarkDuplicates Getting Memory Stats.
INFO 2023-11-13 14:46:42 MarkDuplicates Before output close freeMemory: 1424130760; totalMemory: 1443889152; maxMemory: 1908932608
INFO 2023-11-13 14:46:42 MarkDuplicates Closed outputs. Getting more Memory Stats.
INFO 2023-11-13 14:46:42 MarkDuplicates After output close freeMemory: 1423082184; totalMemory: 1442840576; maxMemory: 1908932608
[Mon Nov 13 14:46:42 PST 2023] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.19 minutes. Runtime.totalMemory()=1442840576

[main] unrecognized command 'coverage'


jmhallas commented 10 months ago

Hi Charles,

I was just curious if you had any suggestions regarding the 'coverage' error I encountered?

thanks josh


cpouchon commented 10 months ago

Hello Josh,

I think it is also a versioning issue but now with samtools.

Can you give me the versions of your dependencies by running this within your refmaker-env:

conda list

Thank you,

Cheers Charles

jmhallas commented 10 months ago

I have samtools version 1.6; the vignette says >=1.13.

# packages in environment at /share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env:
#
# Name                    Version                   Build    Channel

_libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 2_kmp_llvm conda-forge _r-mutex 1.0.1 anacondar_1 conda-forge appdirs 1.4.4 pyhd3eb1b0_0 anaconda bcftools 1.9 h68d8f2e_9 bioconda binutils_impl_linux-64 2.38 h2a08ee3_1 binutils_linux-64 2.38.0 hc2dff05_0 biopython 1.81 py310h1fa729e_0 conda-forge blas 1.1 openblas conda-forge blast 2.14.1 pl5321h6f7f691_0 bioconda brotli 1.1.0 hd590300_0 conda-forge brotli-bin 1.1.0 hd590300_0 conda-forge brotlipy 0.7.0 py310h7f8727e_1002 anaconda bwa 0.7.17 he4a0461_11 bioconda bwidget 1.9.14 ha770c72_1 conda-forge bzip2 1.0.8 h7b6447c_0 anaconda c-ares 1.19.1 hd590300_0 conda-forge ca-certificates 2023.08.22 h06a4308_0 anaconda cairo 1.16.0 hb05425b_5 cd-hit 4.8.1 h43eeafb_9 bioconda certifi 2023.7.22 py310h06a4308_0 anaconda cffi 1.15.1 py310h5eee18b_3 anaconda charset-normalizer 2.0.4 pyhd3eb1b0_0 anaconda contourpy 1.0.5 py310hdb19cb5_0 cryptography 38.0.4 py310h9ce1e76_0 anaconda curl 7.88.1 hdc1c0ab_1 conda-forge cutadapt 4.4 py310h4b81fae_1 bioconda cycler 0.11.0 pyhd8ed1ab_0 conda-forge dbus 1.13.18 hb2f20db_0 dnaio 1.0.0 py310h4b81fae_0 bioconda entrez-direct 16.2 he881be0_1 bioconda ete3 3.1.3 pyhd8ed1ab_0 conda-forge expat 2.5.0 hcb278e6_1 conda-forge fastqc 0.12.1 hdfd78af_0 bioconda fftw 3.3.9 h27cfd23_1 anaconda font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge fontconfig 2.14.2 h14ed4e7_0 conda-forge fonttools 4.42.1 py310h2372a71_0 conda-forge freetype 2.12.1 hca18f0e_1 conda-forge fribidi 1.0.10 h36c2ea0_0 conda-forge gcc_impl_linux-64 11.2.0 h1234567_1 gcc_linux-64 11.2.0 h5c386dc_0 gettext 0.21.1 h27087fc_0 conda-forge gfortran_impl_linux-64 11.2.0 h1234567_1 gfortran_linux-64 11.2.0 hc2dff05_0 giflib 5.2.1 h0b41bf4_3 conda-forge glib 2.69.1 he621ea3_2 graphite2 1.3.14 h295c915_1 gsl 2.5 h294904e_1 conda-forge gst-plugins-base 1.14.1 h6a678d5_1 gstreamer 1.14.1 h5eee18b_1 gxx_impl_linux-64 11.2.0 h1234567_1 gxx_linux-64 11.2.0 hc2dff05_0 harfbuzz 4.3.0 hf52aaf7_1 htslib 1.9 h244ad75_9 bioconda icu 
58.2 hf484d3e_1000 conda-forge idna 3.4 py310h06a4308_0 anaconda intel-openmp 2021.4.0 h06a4308_3561 anaconda isa-l 2.30.0 ha770c72_4 conda-forge joblib 1.2.0 py310h06a4308_0 anaconda jpeg 9e h0b41bf4_3 conda-forge kernel-headers_linux-64 2.6.32 he073ed8_16 conda-forge kiwisolver 1.4.4 py310h6a678d5_0 krb5 1.20.1 h143b758_1 lcms2 2.15 hfd0df8a_0 conda-forge ld_impl_linux-64 2.38 h1181459_1 anaconda lerc 3.0 h295c915_0 libblas 3.9.0 16_linux64_openblas conda-forge libbrotlicommon 1.1.0 hd590300_0 conda-forge libbrotlidec 1.1.0 hd590300_0 conda-forge libbrotlienc 1.1.0 hd590300_0 conda-forge libcblas 3.9.0 16_linux64_openblas conda-forge libclang 10.0.1 default_hb85057a_2 libcurl 7.88.1 hdc1c0ab_1 conda-forge libdeflate 1.17 h5eee18b_0 libedit 3.1.20221030 h5eee18b_0 libev 4.33 h516909a_1 conda-forge libevent 2.1.12 hf998b51_1 conda-forge libexpat 2.5.0 hcb278e6_1 conda-forge libffi 3.4.2 h6a678d5_6 anaconda libgcc-devel_linux-64 11.2.0 h1234567_1 libgcc-ng 13.2.0 h807b86a_3 conda-forge libgfortran-ng 11.2.0 h00389a5_1 anaconda libgfortran5 11.2.0 h1234567_1 anaconda libgomp 13.2.0 h807b86a_3 conda-forge libidn2 2.3.4 h166bdaf_0 conda-forge liblapack 3.9.0 16_linux64_openblas conda-forge libllvm10 10.0.1 he513fc3_3 conda-forge libnghttp2 1.52.0 h61bc06f_0 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge libpng 1.6.39 h753d276_0 conda-forge libpq 12.15 hdbd6064_1 libsqlite 3.43.0 h2797004_0 conda-forge libssh2 1.11.0 h0841786_0 conda-forge libstdcxx-devel_linux-64 11.2.0 h1234567_1 libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge libtiff 4.5.1 h6a678d5_0 libunistring 0.9.10 h7f98852_0 conda-forge libuuid 2.38.1 h0b41bf4_0 conda-forge libwebp 1.2.4 h1daa5a0_1 conda-forge libwebp-base 1.2.4 h5eee18b_1 libxcb 1.15 h0b41bf4_0 conda-forge libxkbcommon 1.0.1 hfa300c1_0 libxml2 2.9.14 h74e7548_0 libxslt 1.1.35 h4e12654_0 libzlib 1.2.13 hd590300_5 conda-forge llvm-openmp 16.0.6 h4dfa4b3_0 conda-forge lxml 4.9.1 py310h1edc446_0 lz4-c 
1.9.3 h9c3ff4c_1 conda-forge mafft 7.520 h031d066_2 bioconda make 4.3 hd18ef5c_1 conda-forge markov_clustering 0.0.6 py_0 bioconda matplotlib 3.7.1 py310h06a4308_1 matplotlib-base 3.7.1 py310h1128e8f_1 mkl 2021.4.0 h06a4308_640 anaconda mkl-service 2.4.0 py310h7f8727e_0 anaconda mkl_random 1.2.2 py310h00e6091_0 anaconda munkres 1.1.4 pyh9f0ad1d_0 conda-forge ncbi-vdb 3.0.7 hdbdd923_0 bioconda ncurses 6.4 h6a678d5_0 anaconda networkx 3.1 py310h06a4308_0 anaconda nspr 4.35 h6a678d5_0 nss 3.89.1 h6a678d5_0 numpy 1.26.0 py310hb13e2d6_0 conda-forge openblas 0.3.21 pthreads_h320a7e8_3 conda-forge openjdk 8.0.332 h166bdaf_0 conda-forge openssl 3.1.4 hd590300_0 conda-forge ossuuid 1.6.2 hf484d3e_1000 conda-forge packaging 22.0 py310h06a4308_0 anaconda pango 1.50.7 h05da053_0 pbzip2 1.1.13 0 conda-forge pcre 8.45 h9c3ff4c_0 conda-forge pcre2 10.37 hc3806b6_1 conda-forge perl 5.32.1 4_hd590300_perl5 conda-forge perl-alien-build 2.48 pl5321hec16e2b_0 bioconda perl-alien-libxml2 0.17 pl5321hec16e2b_0 bioconda perl-archive-tar 2.40 pl5321hdfd78af_0 bioconda perl-business-isbn 3.007 pl5321hdfd78af_0 bioconda perl-business-isbn-data 20210112.006 pl5321hdfd78af_0 bioconda perl-capture-tiny 0.48 pl5321hdfd78af_2 bioconda perl-carp 1.38 pl5321hdfd78af_4 bioconda perl-common-sense 3.75 pl5321hdfd78af_0 bioconda perl-compress-raw-bzip2 2.201 pl5321h87f3376_1 bioconda perl-compress-raw-zlib 2.105 pl5321h87f3376_0 bioconda perl-constant 1.33 pl5321hdfd78af_2 bioconda perl-data-dumper 2.183 pl5321hec16e2b_1 bioconda perl-encode 3.19 pl5321hec16e2b_1 bioconda perl-exporter 5.72 pl5321hdfd78af_2 bioconda perl-exporter-tiny 1.002002 pl5321hdfd78af_0 bioconda perl-extutils-makemaker 7.70 pl5321hd8ed1ab_0 conda-forge perl-ffi-checklib 0.28 pl5321hdfd78af_0 bioconda perl-file-chdir 0.1010 pl5321hdfd78af_3 bioconda perl-file-path 2.18 pl5321hd8ed1ab_0 conda-forge perl-file-temp 0.2304 pl5321hd8ed1ab_0 conda-forge perl-file-which 1.24 pl5321hd8ed1ab_0 conda-forge perl-importer 0.026 
pl5321hdfd78af_0 bioconda perl-io-compress 2.201 pl5321hdbdd923_2 bioconda perl-io-zlib 1.14 pl5321hdfd78af_0 bioconda perl-json 4.10 pl5321hdfd78af_0 bioconda perl-json-xs 2.34 pl5321h4ac6f70_6 bioconda perl-list-moreutils 0.430 pl5321hdfd78af_0 bioconda perl-list-moreutils-xs 0.430 pl5321h031d066_2 bioconda perl-mime-base64 3.16 pl5321hec16e2b_2 bioconda perl-parent 0.236 pl5321hdfd78af_2 biocondaperl-path-tiny 0.122 pl5321hdfd78af_0 bioconda perl-pathtools 3.75 pl5321hec16e2b_3 bioconda perl-scalar-list-utils 1.62 pl5321hec16e2b_1 bioconda perl-scope-guard 0.21 pl5321hdfd78af_3 bioconda perl-sub-info 0.002 pl5321hdfd78af_1 bioconda perl-term-table 0.016 pl5321hdfd78af_0 bioconda perl-test2-suite 0.000145 pl5321hdfd78af_0 bioconda perl-types-serialiser 1.01 pl5321hdfd78af_0 bioconda perl-uri 5.12 pl5321hdfd78af_0 bioconda perl-xml-libxml 2.0207 pl5321h661654b_0 bioconda perl-xml-namespacesupport 1.12 pl5321hdfd78af_1 bioconda perl-xml-sax 1.02 pl5321hdfd78af_1 bioconda perl-xml-sax-base 1.09 pl5321hdfd78af_1 bioconda picard 2.27.5 hdfd78af_0 bioconda pigz 2.6 h27826a3_0 conda-forge pillow 9.4.0 py310h6a678d5_0 pip 22.3.1 py310h06a4308_0 anaconda pixman 0.42.2 h59595ed_0 conda-forge ply 3.11 py_1 conda-forge pooch 1.4.0 pyhd3eb1b0_0 anaconda pthread-stubs 0.4 h36c2ea0_1001 conda-forge pycparser 2.21 pyhd3eb1b0_0 anaconda pyopenssl 23.2.0 pyhd8ed1ab_1 conda-forge pyparsing 3.1.1 pyhd8ed1ab_0 conda-forge pyqt 5.15.7 py310h6a678d5_1 pyqt5-sip 12.11.0 pypi_0 pypi pysocks 1.7.1 py310h06a4308_0 anaconda python 3.10.12 hd12c33a_0_cpython conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python-isal 1.2.0 py310h2372a71_0 conda-forge python_abi 3.10 3_cp310 conda-forge qt-main 5.15.2 h327a75a_7 qt-webengine 5.15.9 hd2b0992_4 qtwebkit 5.212 h4eab89a_4 r-base 4.2.0 h1ae530e_0 readline 8.2 h5eee18b_0 anaconda requests 2.28.1 py310h06a4308_0 anaconda samtools 1.6 hc3601fc_10 bioconda scikit-learn 1.3.0 py310hf7d194e_0 conda-forge scipy 1.9.3 py310hdfbd76f_1 
conda-forge setuptools 65.6.3 py310h06a4308_0 anaconda sip 6.6.2 py310h6a678d5_0 six 1.16.0 pyhd3eb1b0_1 anaconda spades 3.13.0 0 bioconda sqlite 3.43.0 h2c6b66d_0 conda-forge sysroot_linux-64 2.12 he073ed8_16 conda-forge threadpoolctl 3.2.0 pyha21a80b_0 conda-forge tk 8.6.12 h1ccaba5_0 anaconda tktable 2.10 h0c5db8f_4 conda-forge toml 0.10.2 pyhd8ed1ab_0 conda-forge tornado 6.3.3 py310h2372a71_0 conda-forge trimal 1.4.1 h4ac6f70_8 bioconda tzdata 2022a hda174b7_0 anaconda unicodedata2 15.0.0 py310h5764c6d_0 conda-forge urllib3 1.26.14 py310h06a4308_0 anaconda wget 1.20.3 ha35d2d1_1 conda-forge wheel 0.37.1 pyhd3eb1b0_0 anaconda xopen 1.7.0 py310hff52083_2 conda-forge xorg-kbproto 1.0.7 h7f98852_1002 conda-forge xorg-libice 1.1.1 hd590300_0 conda-forge xorg-libsm 1.2.4 h7391055_0 conda-forge xorg-libx11 1.8.6 h8ee46fc_0 conda-forge xorg-libxau 1.0.11 hd590300_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xorg-libxext 1.3.4 h0b41bf4_2 conda-forge xorg-libxrender 0.9.11 hd590300_0 conda-forge xorg-renderproto 0.11.1 h7f98852_1002 conda-forge xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge xorg-xproto 7.0.31 h7f98852_1007 conda-forge xz 5.2.10 h5eee18b_1 anaconda zlib 1.2.13 hd590300_5 conda-forge zstandard 0.19.0 py310h1275a96_2 conda-forge zstd 1.5.2 h8a70e8d_1 conda-forge


cpouchon commented 10 months ago

This is probably related to that samtools 1.6 version.

Can you try to reinstall both samtools and bcftools within the environment?

conda install "samtools>=1.13" "bcftools>=1.13"

cheers,

Charles
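The failing subcommand here is samtools coverage, which was added in samtools 1.10 (boundary recalled from the samtools release notes; treat it as an assumption), so samtools 1.6 answers with "[main] unrecognized command 'coverage'". A small check along those lines:

```python
import re

def supports_samtools_coverage(version_string):
    """True if this samtools version ships the 'coverage' subcommand,
    introduced in samtools 1.10 (an assumption based on the release
    notes). Older builds print "[main] unrecognized command 'coverage'"
    instead of running it."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if not match:
        raise ValueError(f"unparseable samtools version: {version_string!r}")
    return (int(match.group(1)), int(match.group(2))) >= (1, 10)
```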

jmhallas commented 10 months ago

Hi Charles,

I reran your script, installing all the appropriate package versions. However, I got to the consensus step and hit an error that appears to involve package versions.

Could not parse argument: --compression-level z
/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/scipy/__init__.py:155: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.0)
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
Traceback (most recent call last):
  File "/share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-main/src/consfilter1.py", line 474, in <module>
    stats_ind[s]["f.missing"]=float(stats_ind[s]["c.missing"])/float(tot_loci)
ZeroDivisionError: float division by zero

I double-checked the versions installed in the environment: SciPy 1.9.3 and NumPy 1.26.0. I installed NumPy 1.25.2, thinking I needed a version older than 1.26, and encountered the same "Could not parse argument: --compression-level z" error.
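For reference, the ZeroDivisionError itself is straightforward to guard against; the real problem is that tot_loci is zero because no loci survived the upstream steps. A hypothetical patch to the failing division (not the actual REFMAKER fix) would be:

```python
def fraction_missing(c_missing, tot_loci):
    """Return the fraction of missing loci for one sample, treating an
    empty catalog (tot_loci == 0) as fully missing instead of raising
    ZeroDivisionError. Hypothetical guard; the underlying fix is making
    sure loci actually reach this step."""
    if tot_loci == 0:
        return 1.0  # no loci survived the upstream consensus/filter steps
    return float(c_missing) / float(tot_loci)

# Usage sketch against the failing line in consfilter1.py:
# stats_ind[s]["f.missing"] = fraction_missing(stats_ind[s]["c.missing"], tot_loci)
```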


jmhallas commented 10 months ago

Hi Charles,

I have rerun the pipeline with the updated packages, and I am still getting "Could not parse argument: --compression-level z".

Here are a few things that I have tried to troubleshoot

- Refmaker creates the calling directory with merge_filtered.bcf and merge_filtered_snp.vcf. They contain the correct number of individuals and 78,710 sites.
- Refmaker creates the consense directory.
- Refmaker then prints out:

```
[INFO]: consensus mode
[INFO]: step 1. consensus
Could not parse argument: --compression-level z
```

I am unable to figure out what "compression-level z" refers to or which command is passing this option. I thought maybe the vcf file needed to be compressed, but after compressing it I still received the same error.

Thank you for your time and advice troubleshooting these errors.

-josh


cpouchon commented 9 months ago

Hi Josh,

This is odd, as your output shows that /share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-main/src/consfilter1.py is being used, while the current version of refmaker uses consfilter2.

To be sure, can you download the latest version of refmaker?

In addition, can you check that the files in your ${RES}/outfiles/ folder are not empty?

Thank you,

Cheers,

Charles

cpouchon commented 9 months ago

And again, can you check for files in your ${RES}/trimming/ folder?

Thanks,

C.

jmhallas commented 9 months ago

I don't have an outfiles directory. These are the directories I currently have:

```
assembly  assembly_done.log  assembly_error.log  calling  catalog  consense  mapping  metassembly
```

I don't have a trimming directory. When I run -m consensus it creates the consensus directory and then errors out with "Could not parse argument: --compression-level z".

My original download of refmaker did have consfilter1.py. I downloaded a new version following your GitHub instructions:

wget https://github.com/cpouchon/REFMAKER/archive/master.zip

These are the available scripts in /REFMAKER-main/src, and consfilter1.py is still there:

```
BlastParsing.py     cdhit_parser.py   concat_seq.py     cons_fq_parser.py
consfilter1.py      consparser1.py    Cons_Parser.py    Cons_Parser_outgp.py
Fasta2Nex.py        FiltContigs.py    FilterVCF.py      FiltMeta.py
GeneStat.py         MetaN50.py        RmCovOutliers.py  SAMfiltering.py
tmp_consfilter1.py
```

I double-checked the GitHub repository and there is no consfilter2.py available.


cpouchon commented 9 months ago

Dear Josh,

I understand now why it's not working (and why you hit all these package-version issues). It was a mistake on my side: the master folder doesn't match the released package.

Please get this version of refmaker:

wget https://github.com/cpouchon/REFMAKER/archive/refs/tags/v.0.0.zip

Let me know if it's ok.

Thank you,

Cheers,

Charles

jmhallas commented 9 months ago

Hi Charles,

I downloaded the new refmaker package and started over. I am getting an error again during the catalog filtering step. I had an issue with this step before, when refmaker ran the FiltMeta.py script but wasn't generating clean_catalog.fa; you uploaded a new src/FiltMeta.py and the step worked. I'm curious whether this new issue is related to the problem I was having before.

These are all the files created in my catalog directory:

```
all_clean_metassemblies.fa
all_clean_metassemblies.fa.ndb
all_clean_metassemblies.fa.nhr
all_clean_metassemblies.fa.nin
all_clean_metassemblies.fa.njs
all_clean_metassemblies.fa.not
all_clean_metassemblies.fa.nsq
all_clean_metassemblies.fa.ntf
all_clean_metassemblies.fa.nto
blast_refall_refall_cleaned.out
blast_unclean_k31.out
blast_unclean_k51.out
blast_unclean_k71.out
blast_unclean_k91.out
metacontigs_cpdna.infos
metacontigs_mtdna.infos
metacontigs_others.infos
metacontigs_rdna.infos
```

This is the error:

```
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LANG = "en_US"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/scipy/sparse/_index.py:100: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
Traceback (most recent call last):
  File "/share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-v.0.0/src/FiltMeta.py", line 301, in <module>
    result = mc.run_mcl(matrix, inflation=inflation)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 228, in run_mcl
    matrix = iterate(matrix, expansion, inflation)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 132, in iterate
    matrix = expand(matrix, expansion)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 53, in expand
    return np.linalg.matrix_power(matrix, power)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 635, in matrix_power
    _assert_stacked_2d(a)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 206, in _assert_stacked_2d
    raise LinAlgError('%d-dimensional array given. Array must be '
numpy.linalg.LinAlgError: 0-dimensional array given. Array must be at least two-dimensional
```

I feel like I'm really close to getting the pipeline to work. Thanks for all the help troubleshooting this.

-josh


jmhallas commented 8 months ago

Hi Charles,

I was wondering if you had any suggestions concerning this issue. I tried using the FiltMeta.py script that previously worked and got the following error:

```
Computing graph and adjacency matrix
Clustering using MCL algorithm
1) selection of the best inflation value giving the highest modularity score
/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/scipy/sparse/_index.py:100: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
Traceback (most recent call last):
  File "/share/cdfwwildlife/hallas_dedicated/programs/REFMAKER-v.0.0/src/FiltMeta.py", line 301, in <module>
    result = mc.run_mcl(matrix, inflation=inflation)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 228, in run_mcl
    matrix = iterate(matrix, expansion, inflation)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 132, in iterate
    matrix = expand(matrix, expansion)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/markov_clustering/mcl.py", line 53, in expand
    return np.linalg.matrix_power(matrix, power)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 635, in matrix_power
    _assert_stacked_2d(a)
  File "/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 206, in _assert_stacked_2d
    raise LinAlgError('%d-dimensional array given. Array must be '
numpy.linalg.LinAlgError: 0-dimensional array given. Array must be at least two-dimensional
```
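For what it's worth, that LinAlgError can be reproduced outside REFMAKER: numpy.linalg.matrix_power raises it whenever it is handed a 0-dimensional array instead of a 2-D matrix. My guess (an assumption, not something confirmed in FiltMeta.py) is that the adjacency matrix collapses to a scalar when the blast graph has at most one contig. A quick check (the `mcl_input_ok` helper is hypothetical, not part of REFMAKER):

```python
import numpy as np

# matrix_power on a 0-d array reproduces the exact error from the log.
try:
    np.linalg.matrix_power(np.asarray(0.5), 2)
except np.linalg.LinAlgError as err:
    print(err)  # 0-dimensional array given. Array must be at least two-dimensional

# A sanity check that could run before mc.run_mcl is called:
def mcl_input_ok(matrix):
    arr = np.asarray(matrix)
    return arr.ndim == 2 and arr.shape[0] > 1

print(mcl_input_ok(np.asarray(0.5)))  # False -> nothing to cluster
print(mcl_input_ok(np.eye(3)))        # True
```

If that guess is right, the real question is why the graph built from the blast hits ends up (nearly) empty in your run.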

Here are notes from running the pipeline

DOWNLOAD

```
wget https://github.com/cpouchon/REFMAKER/archive/refs/tags/v.0.0.zip
unzip v.0.0.zip
rm v.0.0.zip
cd ./REFMAKER-v.0.0
```

CREATE ENV + LOAD DEPENDENCIES

```
source /share/cdfwwildlife/hallas_dedicated/Miniconda/etc/profile.d/conda.sh
conda create --name refmaker-env
conda activate refmaker-env
conda install bcftools">=1.13" biopython blast bwa cd-hit cutadapt ete3 fastqc joblib mafft markov_clustering matplotlib networkx numpy picard">=2.27" python samtools">=1.13" scipy spades trimal -y
```

EDIT TOOLS.SH

Use which to find the program locations, then edit the locations in the tools.sh file:

```
which cutadapt fastqc spades.py makeblastdb blastn cd-hit-est bwa samtools bcftools picard trimal
```

These are all the packages in my environment.

Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 2_gnu conda-forge _r-mutex 1.0.1 anacondar_1 conda-forge _sysroot_linux-64_curr_repodata_hack 3 h69a702a_13 conda-forge alsa-lib 1.2.7.2 h166bdaf_0 conda-forge appdirs 1.4.4 pyhd3eb1b0_0 anaconda attr 2.5.1 h166bdaf_1 conda-forge bcftools 1.17 h3cc50cf_1 bioconda binutils_impl_linux-64 2.38 h2a08ee3_1 binutils_linux-64 2.38.0 hc2dff05_0 biopython 1.81 py310h2372a71_1 conda-forge blas 1.1 openblas conda-forge blast 2.15.0 pl5321h6f7f691_1 bioconda brotli 1.1.0 hd590300_0 conda-forge brotli-bin 1.1.0 hd590300_0 conda-forge brotlipy 0.7.0 py310h7f8727e_1002 anaconda bwa 0.7.17 he4a0461_11 bioconda bwidget 1.9.14 ha770c72_1 conda-forge bzip2 1.0.8 h7b6447c_0 anaconda c-ares 1.19.1 hd590300_0 conda-forge ca-certificates 2023.11.17 hbcca054_0 conda-forge cairo 1.16.0 ha61ee94_1014 conda-forge cd-hit 4.8.1 h43eeafb_9 bioconda certifi 2023.11.17 pyhd8ed1ab_0 conda-forge cffi 1.15.1 py310h5eee18b_3 anaconda charset-normalizer 2.0.4 pyhd3eb1b0_0 anaconda contourpy 1.0.5 py310hdb19cb5_0 cryptography 38.0.4 py310h9ce1e76_0 anaconda curl 7.88.1 h5eee18b_0 cutadapt 4.6 py310h4b81fae_1 bioconda cycler 0.11.0 pyhd8ed1ab_0 conda-forge dbus 1.13.18 hb2f20db_0 dnaio 1.2.0 py310h4b81fae_0 bioconda entrez-direct 16.2 he881be0_1 bioconda ete3 3.1.3 pyhd8ed1ab_0 conda-forge expat 2.5.0 hcb278e6_1 conda-forge fastqc 0.12.1 hdfd78af_0 bioconda fftw 3.3.10 nompi_hc118613_108 conda-forge font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge font-ttf-inconsolata 3.000 h77eed37_0 conda-forge font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge font-ttf-ubuntu 0.83 h77eed37_1 conda-forge fontconfig 2.14.2 h14ed4e7_0 conda-forge fonts-conda-ecosystem 1 0 conda-forge fonts-conda-forge 1 0 conda-forge fonttools 4.42.1 py310h2372a71_0 conda-forge freetype 2.12.1 hca18f0e_1 conda-forge fribidi 1.0.10 h36c2ea0_0 conda-forge gawk 5.3.0 ha916aea_0 conda-forge gcc_impl_linux-64 11.2.0 h1234567_1 gcc_linux-64 11.2.0 h5c386dc_0 gettext 0.21.1 
h27087fc_0 conda-forge gfortran_impl_linux-64 11.2.0 h7a446d4_16 conda-forge gfortran_linux-64 11.2.0 hc2dff05_0 giflib 5.2.1 h0b41bf4_3 conda-forge glib 2.74.1 h6239696_0 conda-forge glib-tools 2.74.1 h6239696_0 conda-forge gmp 6.3.0 h59595ed_0 conda-forge graphite2 1.3.14 h295c915_1 gsl 2.7 he838d99_0 conda-forge gst-plugins-base 1.20.3 h57caac4_2 conda-forge gstreamer 1.20.3 hd4edc92_2 conda-forge gxx_impl_linux-64 11.2.0 h1234567_1 gxx_linux-64 11.2.0 hc2dff05_0 harfbuzz 5.3.0 h418a68e_0 conda-forge htslib 1.17 h6bc39ce_1 bioconda icu 70.1 h27087fc_0 conda-forge idna 3.4 py310h06a4308_0 anaconda intel-openmp 2021.4.0 h06a4308_3561 anaconda isa-l 2.30.0 ha770c72_4 conda-forge jack 1.9.21 h2a1e645_0 conda-forge joblib 1.3.2 pyhd8ed1ab_0 conda-forge jpeg 9e h0b41bf4_3 conda-forge kernel-headers_linux-64 3.10.0 h4a8ded7_13 conda-forge keyutils 1.6.1 h166bdaf_0 conda-forge kiwisolver 1.4.4 py310h6a678d5_0 krb5 1.19.4 h568e23c_0 lame 3.100 h166bdaf_1003 conda-forge lcms2 2.14 h6ed2654_0 conda-forge ld_impl_linux-64 2.38 h1181459_1 anaconda lerc 4.0.0 h27087fc_0 conda-forge libblas 3.9.0 16_linux64_openblas conda-forge libbrotlicommon 1.1.0 hd590300_0 conda-forge libbrotlidec 1.1.0 hd590300_0 conda-forge libbrotlienc 1.1.0 hd590300_0 conda-forge libcap 2.66 ha37c62d_0 conda-forge libcblas 3.9.0 16_linux64_openblas conda-forge libclang 14.0.6 default_h7634d5b_1 conda-forge libclang13 14.0.6 default_h9986a30_1 conda-forge libcups 2.3.3 h3e49a29_2 conda-forge libcurl 7.88.1 h91b91d3_0 libdb 6.2.32 h9c3ff4c_0 conda-forge libdeflate 1.14 h166bdaf_0 conda-forge libedit 3.1.20221030 h5eee18b_0 libev 4.33 h516909a_1 conda-forge libevent 2.1.10 h9b69904_4 conda-forge libexpat 2.5.0 hcb278e6_1 conda-forge libffi 3.4.2 h6a678d5_6 anaconda libflac 1.4.3 h59595ed_0 conda-forge libgcc-devel_linux-64 11.2.0 h1234567_1 libgcc-ng 13.2.0 h807b86a_3 conda-forge libgfortran-ng 13.2.0 h69a702a_3 conda-forge libgfortran5 13.2.0 ha4646dd_3 conda-forge libglib 2.74.1 h7a41b64_0 conda-forge 
libgomp 13.2.0 h807b86a_3 conda-forge libiconv 1.17 hd590300_1 conda-forge libidn2 2.3.4 h166bdaf_0 conda-forge liblapack 3.9.0 16_linux64_openblas conda-forge libllvm10 10.0.1 he513fc3_3 conda-forge libllvm14 14.0.6 hcd5def8_4 conda-forge libnghttp2 1.51.0 hdcd2b5c_0 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libogg 1.3.4 h7f98852_1 conda-forge libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge libopus 1.3.1 h7f98852_1 conda-forge libpng 1.6.39 h753d276_0 conda-forge libpq 14.5 h72a31a5_3 conda-forge libsndfile 1.1.0 hcb278e6_1 conda-forge libsqlite 3.44.2 h2797004_0 conda-forge libssh2 1.10.0 haa6b8db_3 conda-forge libstdcxx-devel_linux-64 11.2.0 h1234567_1 libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge libtiff 4.4.0 h82bc61c_5 conda-forge libtool 2.4.7 h27087fc_0 conda-forge libudev1 253 h0b41bf4_0 conda-forge libunistring 0.9.10 h7f98852_0 conda-forge libuuid 2.38.1 h0b41bf4_0 conda-forge libvorbis 1.3.7 h9c3ff4c_0 conda-forge libwebp 1.2.4 h522a892_0 conda-forge libwebp-base 1.2.4 h5eee18b_1 libxcb 1.13 h7f98852_1004 conda-forge libxkbcommon 1.0.3 he3ba5ed_0 conda-forge libxml2 2.9.14 h22db469_4 conda-forge libxslt 1.1.35 h8affb1d_0 conda-forge libzlib 1.2.13 hd590300_5 conda-forge llvm-openmp 8.0.1 hc9558a2_0 conda-forge lxml 4.9.1 py310h1edc446_0 lz4-c 1.9.3 h9c3ff4c_1 conda-forge mafft 7.520 h031d066_3 bioconda make 4.3 hd18ef5c_1 conda-forge markov_clustering 0.0.6 py_0 bioconda matplotlib 3.8.2 py310hff52083_0 conda-forge matplotlib-base 3.8.2 py310h62c0568_0 conda-forge mkl 2021.4.0 h06a4308_640 anaconda mkl-service 2.4.0 py310h7f8727e_0 anaconda mkl_random 1.2.2 py310h00e6091_0 anaconda mpfr 4.2.1 h9458935_0 conda-forge mpg123 1.31.3 hcb278e6_0 conda-forge munkres 1.1.4 pyh9f0ad1d_0 conda-forge mysql-common 8.0.32 h14678bc_0 conda-forge mysql-libs 8.0.32 h54cf53e_0 conda-forge ncbi-vdb 3.0.9 hdbdd923_0 bioconda ncurses 6.4 h6a678d5_0 anaconda networkx 3.2.1 pyhd8ed1ab_0 conda-forge nspr 4.35 h6a678d5_0 nss 3.89.1 h6a678d5_0 numpy 1.26.2 py310hb13e2d6_0 
conda-forge openblas 0.3.21 pthreads_h320a7e8_3 conda-forge openjdk 17.0.3 hea3dc9f_3 conda-forge openmp 8.0.1 0 conda-forge openssl 1.1.1w hd590300_0 conda-forge ossuuid 1.6.2 hf484d3e_1000 conda-forge packaging 22.0 py310h06a4308_0 anaconda pango 1.50.12 h382ae3d_0 conda-forge pbzip2 1.1.13 0 conda-forge pcre 8.45 h9c3ff4c_0 conda-forge pcre2 10.37 hc3806b6_1 conda-forge perl 5.32.1 4_hd590300_perl5 conda-forge perl-alien-build 2.48 pl5321hec16e2b_0 bioconda perl-alien-libxml2 0.17 pl5321hec16e2b_0 bioconda perl-archive-tar 2.40 pl5321hdfd78af_0 bioconda perl-business-isbn 3.007 pl5321hdfd78af_0 bioconda perl-business-isbn-data 20210112.006 pl5321hdfd78af_0 bioconda perl-capture-tiny 0.48 pl5321hdfd78af_2 bioconda perl-carp 1.38 pl5321hdfd78af_4 bioconda perl-common-sense 3.75 pl5321hdfd78af_0 bioconda perl-compress-raw-bzip2 2.201 pl5321h87f3376_1 bioconda perl-compress-raw-zlib 2.105 pl5321h87f3376_0 bioconda perl-constant 1.33 pl5321hdfd78af_2 bioconda perl-data-dumper 2.183 pl5321hec16e2b_1 bioconda perl-encode 3.19 pl5321hec16e2b_1 bioconda perl-exporter 5.72 pl5321hdfd78af_2 bioconda perl-exporter-tiny 1.002002 pl5321hdfd78af_0 bioconda perl-extutils-makemaker 7.70 pl5321hd8ed1ab_0 conda-forge perl-ffi-checklib 0.28 pl5321hdfd78af_0 bioconda perl-file-chdir 0.1010 pl5321hdfd78af_3 bioconda perl-file-path 2.18 pl5321hd8ed1ab_0 conda-forge perl-file-temp 0.2304 pl5321hd8ed1ab_0 conda-forge perl-file-which 1.24 pl5321hd8ed1ab_0 conda-forge perl-importer 0.026 pl5321hdfd78af_0 bioconda perl-io-compress 2.201 pl5321hdbdd923_2 bioconda perl-io-zlib 1.14 pl5321hdfd78af_0 bioconda perl-json 4.10 pl5321hdfd78af_0 bioconda perl-json-xs 2.34 pl5321h4ac6f70_6 bioconda perl-list-moreutils 0.430 pl5321hdfd78af_0 bioconda perl-list-moreutils-xs 0.430 pl5321h031d066_2 bioconda perl-mime-base64 3.16 pl5321hec16e2b_2 bioconda perl-parent 0.236 pl5321hdfd78af_2 bioconda perl-path-tiny 0.122 pl5321hdfd78af_0 bioconda perl-pathtools 3.75 pl5321hec16e2b_3 bioconda 
perl-scalar-list-utils 1.62 pl5321hec16e2b_1 bioconda perl-scope-guard 0.21 pl5321hdfd78af_3 bioconda perl-sub-info 0.002 pl5321hdfd78af_1 bioconda perl-term-table 0.016 pl5321hdfd78af_0 bioconda perl-test2-suite 0.000145 pl5321hdfd78af_0 bioconda perl-types-serialiser 1.01 pl5321hdfd78af_0 bioconda perl-uri 5.12 pl5321hdfd78af_0 bioconda perl-xml-libxml 2.0207 pl5321h661654b_0 bioconda perl-xml-namespacesupport 1.12 pl5321hdfd78af_1 bioconda perl-xml-sax 1.02 pl5321hdfd78af_1 bioconda perl-xml-sax-base 1.09 pl5321hdfd78af_1 bioconda picard 3.1.1 hdfd78af_0 bioconda pigz 2.6 h27826a3_0 conda-forge pillow 9.4.0 py310h6a678d5_0 pip 22.3.1 py310h06a4308_0 anaconda pixman 0.42.2 h59595ed_0 conda-forge ply 3.11 py_1 conda-forge pooch 1.4.0 pyhd3eb1b0_0 anaconda pthread-stubs 0.4 h36c2ea0_1001 conda-forge pulseaudio 14.0 habe0971_10 conda-forge pycparser 2.21 pyhd3eb1b0_0 anaconda pyopenssl 23.2.0 pyhd8ed1ab_1 conda-forge pyparsing 3.1.1 pyhd8ed1ab_0 conda-forge pyqt 5.15.7 py310hab646b1_3 conda-forge pyqt5-sip 12.11.0 py310heca2aa9_3 conda-forge pysocks 1.7.1 py310h06a4308_0 anaconda python 3.10.8 h257c98d_0_cpython conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python-isal 1.2.0 py310h2372a71_0 conda-forge python_abi 3.10 3_cp310 conda-forge qt-main 5.15.6 hc525480_0 conda-forge qt-webengine 5.15.4 hcbadb6c_3 conda-forge qtwebkit 5.212 h3383a02_6 conda-forge r-base 4.2.1 h7880091_2 conda-forge readline 8.2 h5eee18b_0 anaconda requests 2.28.1 py310h06a4308_0 anaconda samtools 1.18 hd87286a_0 bioconda scikit-learn 1.3.0 py310hf7d194e_0 conda-forge scipy 1.11.4 py310hb13e2d6_0 conda-forge sed 4.8 he412f7d_0 conda-forge setuptools 65.6.3 py310h06a4308_0 anaconda sip 6.7.12 py310hc6cd4ac_0 conda-forge six 1.16.0 pyhd3eb1b0_1 anaconda spades 3.15.5 h95f258a_1 bioconda sqlite 3.44.2 h2c6b66d_0 conda-forge sysroot_linux-64 2.17 h4a8ded7_13 conda-forge threadpoolctl 3.2.0 pyha21a80b_0 conda-forge tk 8.6.13 noxft_h4845f30_101 conda-forge tktable 2.10 h0c5db8f_4 
conda-forge toml 0.10.2 pyhd8ed1ab_0 conda-forge tomli 2.0.1 pyhd8ed1ab_0 conda-forge tornado 6.3.3 py310h2372a71_0 conda-forge trimal 1.4.1 h4ac6f70_8 bioconda tzdata 2022a hda174b7_0 anaconda unicodedata2 15.0.0 py310h5764c6d_0 conda-forge urllib3 1.26.14 py310h06a4308_0 anaconda wget 1.20.3 ha56f1ee_1 conda-forge wheel 0.37.1 pyhd3eb1b0_0 anaconda xcb-util 0.4.0 h516909a_0 conda-forge xcb-util-image 0.4.0 h166bdaf_0 conda-forge xcb-util-keysyms 0.4.0 h516909a_0 conda-forge xcb-util-renderutil 0.3.9 h166bdaf_0 conda-forge xcb-util-wm 0.4.1 h516909a_0 conda-forge xopen 1.7.0 py310hff52083_2 conda-forge xorg-fixesproto 5.0 h7f98852_1002 conda-forge xorg-inputproto 2.3.2 h7f98852_1002 conda-forge xorg-kbproto 1.0.7 h7f98852_1002 conda-forge xorg-libice 1.0.10 h7f98852_0 conda-forge xorg-libsm 1.2.3 hd9c2040_1000 conda-forge xorg-libx11 1.8.4 h0b41bf4_0 conda-forge xorg-libxau 1.0.11 hd590300_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xorg-libxext 1.3.4 h0b41bf4_2 conda-forge xorg-libxfixes 5.0.3 h7f98852_1004 conda-forge xorg-libxi 1.7.10 h7f98852_0 conda-forge xorg-libxrender 0.9.10 h7f98852_1003 conda-forge xorg-libxt 1.3.0 hd590300_0 conda-forge xorg-libxtst 1.2.3 h7f98852_1002 conda-forge xorg-recordproto 1.14.2 h7f98852_1002 conda-forge xorg-renderproto 0.11.1 h7f98852_1002 conda-forge xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge xorg-xproto 7.0.31 h7f98852_1007 conda-forge xz 5.2.10 h5eee18b_1 anaconda zlib 1.2.13 hd590300_5 conda-forge zstandard 0.19.0 py310h1275a96_2 conda-forge zstd 1.5.2 h8a70e8d_1 conda-forge

This is my tools.sh

```
#!/bin/bash

CUTADAPT=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/cutadapt
FASTQC=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/fastqc
SPADES=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/spades.py
BLASTDB=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/makeblastdb
BLASTN=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/blastn
CDHIT=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/cd-hit-est
BWA=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/bwa
SAMTOOLS=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/samtools
BCFTOOLS=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/bcftools
PICARD=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/picard
TRIMAL=/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/bin/trimal
```

I have tried everything I can think of. Thanks again for your help.

-josh


cpouchon commented 7 months ago

Dear Josh,

I apologize for the (very) late reply. I have made several changes to various scripts.

This issue should be fixed now.

Can you download the latest release of REFMAKER (v.1.0) and reinstall a conda environment as indicated on the page?

Cheers,

Charles

jmhallas commented 7 months ago

Hi Charles,

Thanks for updating the program. I was reinstalling it, and I noticed cutadapt is not in the package install step but is in the tools.sh file. Do I need cutadapt?


cpouchon commented 7 months ago

Hi,

You don't need to use it. I have added it for further development.

Thank you. Let me know if you have new issues.

Cheers,

Charles

jmhallas commented 7 months ago

Hi Charles,

I just ran the new script and encountered a new error I haven't seen before:

```
/share/cdfwwildlife/hallas_dedicated/Miniconda/envs/refmaker-env/share/spades/spades_pipeline/support.py:508: SyntaxWarning: invalid escape sequence '\d'
  return [atoi(c) for c in re.split("(\d+)", text)]
```

I have attached both the out file and the error file. The clean_catalog.fa file wasn't created; the catalog directory only has unclean_catalog_k31.fa, unclean_catalog_k51.fa, unclean_catalog_k71.fa, and unclean_catalog_k91.fa.

-josh


cpouchon commented 7 months ago

Hi Josh,

I had the same issue this morning. I have uploaded the executable file REFMAKER-v.1.0/refmaker. Can you download it?

Thank you,

Cheers,

Charles

jmhallas commented 7 months ago

Hi Charles,

I got my final output files. Thanks for all your help working with me on running your program.

I had one last question about the number of loci retained. I was looking at my concatenated.log file (attached) and I'm trying to make sense of the number of loci removed.

After consensus I have: 5801

Filtering:

```
16/5801 loci removed according to the depth cutoff
388/5801 loci removed according to the minimal length
160/5801 loci removed according to the heterozygosity
187/5801 loci removed according to the population level thresholds
remaining loci: 5063/5801
4810/5063 loci shared with a least one outgroup
4810/5063 loci shared with 1 outgroup taxa
[INFOS]: computing final output fasta files
[INFOS]: final matrix
loci number: 107
samples: 77
length (bp): 94650
```

I'm not sure how I got 5063 remaining loci after filtering when 751 were removed. After removal it says I have 4810/5063 shared with an outgroup, but then the final matrix says 107 loci. Is there a filtering step that I am missing?
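One possible reading of the first discrepancy (an assumption about how the log tallies, not something confirmed in the REFMAKER code): a locus failing two filters may be counted once per filter, so the per-filter counts can sum to more than the number of loci actually removed:

```python
# Counts taken from the concatenated.log above.
total = 5801
per_filter = [16, 388, 160, 187]  # depth, length, heterozygosity, population
remaining = 5063

unique_removed = total - remaining             # loci actually dropped
double_counted = sum(per_filter) - unique_removed
print(unique_removed)   # 738
print(double_counted)   # 13 loci would have failed more than one filter
```

That would explain getting 5063 instead of 5801 - 751 = 5050, but it does not explain the drop from 4810 shared loci to a 107-locus final matrix; that part does look like an extra filtering step worth asking about.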
