short read data - Githubissues

refresh-bio / kmer-db

Kmer-db is a fast and memory-efficient tool for large-scale k-mer analyses (indexing, querying, estimating evolutionary relationships, etc.).

GNU General Public License v3.0


short read data #23

Open rmormando opened 10 months ago

rmormando commented 10 months ago

I have Illumina short-read sequences that I want to use as input. Will the tool take that into account when creating a database?

The samples are labeled Sample1_R1.fasta and Sample1_R2.fasta. Would I put them all into the input list file so it reads like:

Sample1_R1
Sample1_R2
Sample2_R1
Sample2_R2
....
Sample20_R1
Sample20_R2

Would the tool recognize short read inputs? Or should I merge the two files so it only takes one input for that particular sample?

agudys commented 5 months ago

Hello!

Sorry for the delayed answer; I somehow missed your issue! Every entry in the file list is treated as a separate sample. So if you have two files of paired-end reads for one sample, you need to concatenate them before the analysis, exactly as you said.

Best, Adam
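
For anyone landing here with the same layout, the concatenation step could be scripted roughly as follows. This is only a sketch in Python, not part of kmer-db: the function name `merge_paired_reads`, the output directory, and the list file name are hypothetical, and the `_R1.fasta` / `_R2.fasta` suffixes are taken from the question above. It writes one merged FASTA per sample plus a list file with one sample name per line; whether entries in that list should carry a file extension is an assumption to check against the kmer-db README.

```python
import shutil
from pathlib import Path


def merge_paired_reads(sample_dir, out_dir, list_path):
    """Concatenate <name>_R1.fasta and <name>_R2.fasta into <name>.fasta.

    Writes the merged sample names (one per line, no extension) to
    list_path and returns them as a list. All names are illustrative.
    """
    sample_dir, out_dir = Path(sample_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for r1 in sorted(sample_dir.glob("*_R1.fasta")):
        name = r1.name[: -len("_R1.fasta")]
        r2 = sample_dir / f"{name}_R2.fasta"
        if not r2.exists():
            raise FileNotFoundError(f"missing mate file for {r1.name}")
        with open(out_dir / f"{name}.fasta", "wb") as merged:
            for part in (r1, r2):
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, merged)
                    # Add a newline if the file lacks a trailing one, so the
                    # next FASTA header does not get glued to a sequence line.
                    if part.stat().st_size > 0:
                        src.seek(-1, 2)
                        if src.read(1) != b"\n":
                            merged.write(b"\n")
        names.append(name)
    Path(list_path).write_text("".join(n + "\n" for n in names))
    return names
```

Called as `merge_paired_reads("reads", "merged", "samples.txt")`, this would leave `merged/Sample1.fasta`, `merged/Sample2.fasta`, and so on, each treated as a single sample, which matches the one-entry-per-sample behaviour described above.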