Open jnarayan81 opened 6 years ago
Hi,
Sorry for the late answer.
I can see you used "-perfect /media/urbe/ARCgenomic/toyGenome/toy.fasta". Is your "toy.fasta" file an actual genome? If so, this is most probably why you are getting this error.
When the "-perfect" option is used, ELECTOR assumes the input file contains "perfect / reference" reads, that is, reads without sequencing errors. If you wish to provide a genome directly to ELECTOR and let it find the perfect / reference reads itself, you should use the "-reference" option instead.
Please tell me if that helps!
Pierre
Hi @morispi
Thanks for the reply.
I tried with -reference on example data
python3 elector.py -reference example/example_reference.fasta -corrected example/corrected_reads.fasta -uncorrected Sim -threads 46 -simulator simlord -output test2
but it has been busy doing something for the last 12 hours, which I think is not normal! Is that command right? Or did I miss something?
Hey,
If you wish to use the SimLoRD simulated reads from the toy example along with the -reference switch, you should run this command, as indicated in the README:
python3 elector.py -reference example/example_reference.fasta -corrected example/Simlord/correctedReads.fasta -uncorrected example/Simlord/simulatedReads -simulator simlord
Indeed, the reads given to the -corrected and -uncorrected switches must have matching headers. It seems that what you provided to the -uncorrected switch (Sim) is a nonexistent file, which is why the program never stops.
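Since a prefix that matches no file appears to make the run hang rather than fail, it can be worth checking the prefix before launching ELECTOR. The snippet below is only a minimal sketch and is not part of ELECTOR: the helper name files_with_prefix is hypothetical, and it simply lists whatever files start with the given prefix.

```python
import glob


def files_with_prefix(prefix):
    """Return the sorted list of paths that start with `prefix`.

    Hypothetical helper, not part of ELECTOR. glob.escape keeps any
    special characters in the prefix from being read as wildcards.
    """
    return sorted(glob.glob(glob.escape(prefix) + "*"))


if __name__ == "__main__":
    # Prefix taken from the command discussed above; adjust to your own data.
    prefix = "example/Simlord/simulatedReads"
    matches = files_with_prefix(prefix)
    if matches:
        print("Files found for -uncorrected prefix:", matches)
    else:
        print("No file starts with", repr(prefix), "- check the -uncorrected value")
```

An empty result for the value passed to -uncorrected (for instance Sim) would suggest the path is wrong before any long computation starts.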
Quick use guide of the parameters:
For simulated reads:
-corrected LR.fasta: here you must provide a file of corrected long reads in fasta format
-uncorrected SimLRPrefix: here you must provide the prefix of the simulated reads files. They must be the original reads for which the correction has been provided to the -corrected switch. For example, if you simulated a set of reads with SimLoRD in the test/simulation/ directory, this directory will contain the following files: simReads.h5, simReads.fastq, and simReads.fastq.sam. You must thus provide test/simulation/simReads (the common prefix of all the files) to the -uncorrected switch.
-reference ref.fasta: here you must provide a file containing the reference genome in fasta format
-simulator name: here you must provide the name of the simulator that was used to simulate the reads (we currently only support SimLoRD and NanoSim)
-corrector name: here you must provide the name of the correction method that was used to correct the reads (see list of supported correctors in the README)
For real reads:
-corrected LR.fasta: here you must provide a file of corrected long reads in fasta format
-uncorrected rawReads.fasta: here you must provide a file of uncorrected long reads in fasta format. They must be the original reads for which the correction has been provided to the -corrected switch.
-reference ref.fasta: here you must provide a file containing the reference genome in fasta format
-corrector name: here you must provide the name of the correction method that was used to correct the reads (see list of supported correctors in the README)
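Because the corrected reads must correspond to the original uncorrected reads, a quick header comparison between the two fasta files can catch a mismatched pair early. The following is a rough sketch under a stated assumption: it compares only the first whitespace-delimited token of each header line, which may not be the rule ELECTOR itself applies (some correctors rewrite headers, which is what the -corrector switch is meant to handle). The function names fasta_ids and unmatched_corrected_ids are hypothetical.

```python
def fasta_ids(path):
    """Collect the first token of every '>' header line in a fasta file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def unmatched_corrected_ids(corrected_path, uncorrected_path):
    """Return corrected-read ids that have no counterpart among the raw reads.

    Assumption: ids match exactly on the first header token. This is an
    illustration only, not ELECTOR's own matching logic.
    """
    return fasta_ids(corrected_path) - fasta_ids(uncorrected_path)
```

For example, unmatched_corrected_ids("LR.fasta", "rawReads.fasta") returning a non-empty set would hint that the two files do not come from the same read set, or that the corrector changed the headers.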
Please tell me if that helps, or if it's clear enough. If you don't manage to run ELECTOR on the datasets you want, just describe what you want to do, and I'll help by providing the command line.
Cheers, P
Some swag* files seem to be missing!
I tried this as well, but ended up with the following error