dougwyu closed this issue 6 years ago
Hi, this is a public GitHub repository, so anyone can access the source code. To save you some time, here is the command line that we use to align Illumina reads to the Nanopore reads in the NaS_wrapped source file:
cat $OUTPUT_DIR/tmp/ILMN_reads.fa | parallel -j $NB_PROC --cat --pipe --block 10M --recstart ">" "$BLAT -tileSize=$TILE -stepSize=$STEP -noHead $NANO_READS {} $OUTPUT_DIR/tmp/psl/blat-alignment.job{#}.tile$TILE.step$STEP.psl" >$OUTPUT_DIR/tmp/blat-alignment.stderr
I think the variable names are self-explanatory, but if you need more help, do not hesitate to ask.
Thank you very much @bistace.
My mistake was to download the zip file. All the NaS modules were in binary format.
I have now used GitHub Desktop to sync, and I can see the source code for most of the modules.
However, I cannot see the source code for extract_reads, which I think must refer to your code for extracting the reads that successfully map to each MinION read?
Hi, the extract_reads binary is part of the compareads2 tool, available here: http://colibread.inria.fr/software/compareads.
This piece of code is no longer used in NaS; it was used to retrieve the sequences of reads that shared similar k-mers after running compareads.
You can use the following command lines to grab the names of the Illumina reads that mapped to each MinION read:
mkdir output_dir
cat your_psl_file.psl | awk -v PFX=output_dir '{ file=PFX"/"$14".psl"; print $0>file; }'
This will create one file per MinION read, containing the PSL alignment lines for that read. The names of the Illumina reads that mapped to it are in column 10 of each line.
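If only the read names are needed rather than the full alignment lines, a minimal sketch along the following lines might help. It assumes headerless, tab-separated PSL output (as produced with -noHead), where column 10 is the query (Illumina) name and column 14 is the target (MinION) name. The function name `list_read_pairs` is made up for illustration and is not part of NaS.

```shell
# Hypothetical helper (not part of NaS): print unique
# "MinION_read<TAB>Illumina_read" pairs from a headerless PSL file.
# PSL column 14 = target (MinION) name, column 10 = query (Illumina) name.
list_read_pairs() {
    awk -F'\t' 'NF >= 21 { print $14 "\t" $10 }' "$1" | sort -u
}

# Example usage (file names are placeholders):
#   list_read_pairs your_psl_file.psl > read_pairs.tsv
```

Each output line pairs one MinION read with one Illumina read that aligned to it. An Illumina read with several alignments to the same MinION read is listed once.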
Thank you again @bistace
Hi! I would like to see the source code. Is that possible? In my situation, I really only need to see your code for aligning Illumina reads to MinION reads (using BLAT, if I've understood correctly). I'm trying to use unassembled Illumina reads, individually sequenced from different species, to identify MinION reads from mixed-species samples. The idea is to see which set of Illumina reads (each set = 1 species) maps "best" to each MinION read. "Best" is still to be defined, but presumably will be some combination of high coverage and low standard deviation of mapped Illumina reads across each MinION read. Of course, some (large) proportion of MinION reads won't be ID'd properly, but that should be acceptable.
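As a rough illustration of the coverage and standard-deviation idea in this post, the sketch below computes, for each MinION read in a PSL file, the mean per-base depth and the standard deviation of that depth across the whole read. It is one possible calculation, not a definition of "best" and not code from NaS. It assumes headerless, tab-separated PSL from a nucleotide BLAT run with the MinION reads as the database, so that the block sizes (column 19) and target starts (column 21) are forward-strand coordinates on the MinION read. The function name `coverage_stats` is hypothetical.

```shell
# Hypothetical helper (not part of NaS): for each MinION read (PSL target),
# print name, length, mean depth, and standard deviation of depth.
# Assumes headerless, tab-separated PSL from nucleotide BLAT.
coverage_stats() {
    awk -F'\t' '
    NF >= 21 {
        t = $14                        # target (MinION) read name
        len[t] = $15                   # target length
        n = split($19, size, ",")      # blockSizes; trailing comma gives an empty last item
        split($21, start, ",")         # tStarts, 0-based
        for (b = 1; b <= n; b++) {
            if (size[b] == "") continue
            first = start[b] + 0
            for (p = first; p < first + size[b]; p++) {
                d = depth[t, p]++      # depth at p before this alignment
                sum[t] += 1
                sumsq[t] += 2 * d + 1  # (d+1)^2 - d^2
            }
        }
    }
    END {
        for (t in len) {
            mean = sum[t] / len[t]
            var = sumsq[t] / len[t] - mean * mean
            if (var < 0) var = 0       # guard against rounding error
            printf "%s\t%d\t%.3f\t%.3f\n", t, len[t], mean, sqrt(var)
        }
    }' "$1"
}

# Example usage, one PSL file per species (file names are placeholders):
#   coverage_stats species_A.psl | sort > species_A.coverage.tsv
```

Running this once per species gives comparable tables, which could then be joined on the MinION read name. MinION reads with no alignment do not appear in the PSL file and so are absent from the output. The per-base depth array is kept in memory, so very large inputs may need to be split first.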