Closed juanitagutierrez closed 5 years ago
Hello!
I was unable to reproduce the exact same error. I only got the "Error: Could not make wordmap from file K-mer_lists/..." lines when the input genome files were missing, faulty, or empty, or when the paths in the input file were incorrect.
It seems that in your case, too, the "RD-1806-*.list" files are for some reason not generated correctly. I recommend double-checking the input genome files (e.g. they can't be .gz files) and the paths in the input pheno file.
Also, could you tell me whether the "RD-1806-*.list" files exist in the "K-mer_lists/" directory and, if they do, what their sizes are? Please also check whether the example script "example/test_PS_modelig.sh" works :)
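One way to collect the file sizes asked about above is a small Python sketch like the following. The function name `list_file_sizes` is hypothetical and not part of PhenotypeSeeker; the directory and pattern defaults are taken from the messages in this thread.

```python
from pathlib import Path


def list_file_sizes(directory="K-mer_lists", pattern="RD-1806-*.list"):
    """Return {file name: size in bytes} for list files matching the pattern.

    Hypothetical helper for this thread; returns an empty dict if the
    directory does not exist or nothing matches.
    """
    folder = Path(directory)
    if not folder.is_dir():
        return {}
    return {p.name: p.stat().st_size for p in sorted(folder.glob(pattern))}


if __name__ == "__main__":
    sizes = list_file_sizes()
    if not sizes:
        print("No matching .list files found")
    for name, size in sizes.items():
        # Flag zero-byte files, since an empty list cannot be mapped.
        flag = "  <-- empty" if size == 0 else ""
        print(f"{size:>12}  {name}{flag}")
```

Run from the directory where PhenotypeSeeker was started, it prints one line per list file and marks zero-byte files.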
You could also check whether the GenomeTester4 programs work directly on your input genomes. For example:
glistmaker "genome1..." -o genome1
glistmaker "genome1 genome2 genome3 ..." -o feature_vector
glistquery genome1_16.list -l feature_vector_13.list
The last command is the one that throws the error in the PhenotypeSeeker workflow, because genome1_16.list is not generated correctly when there is a problem with the genome1 file or its path.
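The input problems mentioned in this reply (missing, empty, or gzipped files, or a wrong path) can be screened for before running anything. Below is a minimal sketch; `diagnose_input` is a hypothetical helper, not part of PhenotypeSeeker or GenomeTester4. It relies only on general facts: gzip files start with the bytes 0x1f 0x8b, FASTA records start with ">", and FASTQ records start with "@".

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def diagnose_input(path):
    """Classify one input file as missing, empty, gzipped, unrecognized, or ok.

    Hypothetical helper: it only inspects the first bytes of the file and
    does not validate the whole FASTA/FASTQ content.
    """
    p = Path(path).expanduser()
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    with p.open("rb") as fh:
        head = fh.read(2)
    if head == GZIP_MAGIC:
        return "gzipped"
    if head[:1] not in (b">", b"@"):
        return "not FASTA/FASTQ"
    return "ok"
```

Calling `diagnose_input` on every path listed in the pheno file gives a quick picture of which samples are likely to end up without a usable list.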
Hello, and thanks for your help!
I am running it on files that resulted from joining the forward and reverse reads of each sample using fastq-join from qiime2. They look like any regular fastq file, and I have used them in other programs that also require fasta or fastq formats. Do you have any suggestions on the best strategy when starting with paired-end raw reads? The data.pheno file is ok (i.e. the file locations are correct).
Indeed, the problem starts with the generation of the lists. It starts "Generating the k-mer lists for input samples", but I can't find them in the K-mer_lists folder. Only *mapped.txt files are created but they are empty.
I have tried running the example as you suggested, but it fails to find phenotypeseeker. The location of the cloned repository is specified in my bash profile.
Hi!
PhenotypeSeeker can actually take multiple fastq files per sample if they are specified using a wildcard. For example "sample1 ~/sample1/*.fastq 0; sample2 ~/sample2/*.fastq 1". If you can give both paired-end fastqs with a single wildcard address, I recommend trying this approach. Also, please reinstall phenotypeseeker before trying it; I very recently fixed some bugs that arose when using wildcards in addresses.
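Before relying on a wildcard address, it can help to see which files it actually matches. The sketch below is hypothetical and assumes the layout from the example in this reply: whitespace-separated columns with the sample ID first, the address second, and the phenotype value(s) after that. The real pheno file may differ (for instance, it may have a header line), and whether PhenotypeSeeker expands wildcards exactly like Python's `glob` is also an assumption.

```python
import glob
import os


def expand_pheno_line(line):
    """Split one pheno line and expand the wildcard in its address column.

    Assumed layout (from the example in this thread):
        <sample ID> <address, possibly with a wildcard> <phenotype> ...
    Returns (sample_id, sorted list of matching files, phenotype values).
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}: {line!r}")
    sample_id, pattern, phenotypes = fields[0], fields[1], fields[2:]
    matches = sorted(glob.glob(os.path.expanduser(pattern)))
    return sample_id, matches, phenotypes
```

An empty list of matches for a sample means the address does not resolve to any file, so that sample would need its path fixed first.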
Hi there,
I am opening this issue, although a very similar one was solved before. It does not work for me yet, though. I am sorry to bring this problem back! I am just starting to run the modeling pipeline for PhenotypeSeeker, but I always get the following error messages:
Generating the k-mer lists for input samples: 10 of 10 lists generated.
Generating the k-mer feature vector.
Mapping samples to the feature vector space:
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-Ltenue-6-105-1_S1_L001_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-Ltenue-6-105-1_S1_L001_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-Ltenue-8-34-3_S21_L007_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-Ltenue-8-34-3_S21_L007_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-Ltenue-8-16-1_S22_L008_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-Ltenue-8-16-1_S22_L008_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-8-50-1-thrum_S18_L006_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-8-50-1-thrum_S18_L006_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-Ltenue-6-35-2_S24_L008_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-Ltenue-6-35-2_S24_L008_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-8-8-2-Pin_S17_L006_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-8-8-2-Pin_S17_L006_R1_001_13.list!
gt4_wordmap_new: could not mmap file K-mer_lists/RD-1806-6-75-2-pin_S19_L007_R1_001_13.list
Error: Could not make wordmap from file K-mer_lists/RD-1806-6-75-2-pin_S19_L007_R1_001_13.list!
Just a brief description of some previous steps I have run on my fastq files: I joined my forward and reverse samples using fastq-join, but it did not work. Then I read it could be a formatting issue, and so I converted them to fasta using seqtk seq from seqtk.
Any idea on what could be causing it? Thanks in advance!