Closed huzuner closed 3 years ago
Thanks @huzuner for reporting these issues. As for the conda installation, I suspect the package is out of date. For now, I suggest cloning the repository from GitHub and using the latest commit. We will look into the conda package and update it as well.
As for the first issue you reported here, I will leave it to @cheny19 to comment on that.
Hi @huzuner, sorry for the late reply. I have tried both the pre-trained albacore model and the guppy model, and both produced sequences with the same lengths as their quality scores. In fact, the pre-trained models do not contain any information about the quality simulation, so in theory the choice of model should not affect it. We did have an unequal-length bug before, but it was resolved before v3.0.0. Could you provide more information about your command, so I can try to reproduce this error?
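To check the sequence and quality lengths independently of any validator, something like the following could be run on the simulated file. This is only a minimal sketch and not part of NanoSim: the names `read_fastq` and `find_length_mismatches` are made up for illustration, and it assumes plain four-line FASTQ records (header, sequence, `+`, quality) with no line wrapping.

```python
import sys
from typing import Iterable, Iterator, List, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) tuples.

    Assumes four lines per record; wrapped (multi-line) FASTQ is not handled.
    """
    it = iter(lines)
    while True:
        header = next(it, None)
        if header is None:
            return  # end of input
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\r\n")
        next(it, "")  # the '+' separator line, ignored here
        qual = next(it, "").rstrip("\r\n")
        yield header, seq, qual


def find_length_mismatches(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (header, sequence length, quality length) for each bad record."""
    return [
        (header, len(seq), len(qual))
        for header, seq, qual in read_fastq(lines)
        if len(seq) != len(qual)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_fastq.py reads.fastq
    with open(sys.argv[1]) as handle:
        for header, n_seq, n_qual in find_length_mismatches(handle):
            print(f"{header}: sequence={n_seq} quality={n_qual}")
```

If this prints nothing for the albacore output, the lengths agree and the validator is objecting to something else; any lines it does print would be useful to include in the report.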
As for the conda install, I have just updated the requirements.txt file, so that should solve the problem. Please try the latest release and see how that goes. If it still fails, you can also clone the GitHub repo and install the dependencies with:
conda install --file requirements.txt
Hello,
I am using NanoSim v3.0.0 and I would like to report a bug that occurs with "human_NA12878_DNA_FAB49712_albacore.tar.gz" from the pre-trained_models folder. When I simulate human reads using this model, something is wrong with the aligned FASTQ files. When I run the FASTQ validator on one aligned_reads.fastq with:
biopet-validatefastq -i results/nanosim/hum/1_aligned_reads.fastq
I get the following error and it is not possible to do further processing with this FASTQ file. For example, Sourmash always throws the same error as the FASTQ validator when I try to compute signatures.
On the other hand, when I use "human_NA12878_DNA_FAB49712_guppy.tar.gz", fastq validator and sourmash do not throw any errors and I have no problem.
It could be the case that the albacore model needs to be re-trained.
In addition, the conda installation of NanoSim is also problematic. When I install it via the bioconda channel, simulator.py throws an error at the step "Read KDF of unaligned reads" shortly after the script starts to run. My command is:
simulator.py genome -rg results/refs/hs_genome.fasta -c resources/human_NA12878_DNA_FAB49712_albacore/training -b albacore --num_threads 2 --fastq -o test.fastq -n 10000
And the error:
I think this is related to a problem that I mentioned in a previous issue.
Thank you, Hamdiye