Closed pwh124 closed 1 year ago
This is slightly complex because of the way readfish works. Signal is concatenated over time from the individual chunks of each read. So 303R means that 303 read starts were processed in 0.31916 seconds. Each of those read starts could consist of one or more chunks of read data, depending on how many times the read had been seen before a decision was made.
I hope that helps.
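If it helps with the log processing, here is a minimal sketch of pulling the two numbers out of a token like the one quoted in this thread. It assumes the token always has the shape `<count>R/<seconds>s`, which is inferred only from the `303R/0.31916s` example; `parse_batch_token` is a hypothetical helper name and not part of readfish.

```python
import re

# Assumed token shape, inferred from the single example in this thread:
# an integer count, "R/", a decimal number of seconds, then "s".
TOKEN = re.compile(r"(\d+)R/(\d+(?:\.\d+)?)s")


def parse_batch_token(text):
    """Return (read_starts, seconds) for the first token in text, or None."""
    match = TOKEN.search(text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


read_starts, seconds = parse_batch_token("303R/0.31916s")
# Average wall time per read start in this batch, in milliseconds.
ms_per_read_start = seconds / read_starts * 1000
print(read_starts, seconds, round(ms_per_read_start, 3))
```

The per-read average is only a rough guide, since the reads in a batch can differ in how many chunks they have accumulated.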
Ahhhh ok, I think I understand.
So if I am processing read 1 and read 2, the initial chunk of both of them would be shown as something like:
2R/0.01s
And then, let's say, a decision is made on read 1 with the initial chunk, but another chunk is needed for read 2. So read 2 would be reported as:
1R/0.02s
So the whole output would be:
2R/0.01s
1R/0.02s
Is that right?
No, not quite.
The time measurement is the total time it has taken to process that number of read starts. But it isn't necessarily true that all read starts are just one chunk in length. Some may be longer than that if they have been seen before.
So you could have 10 reads that consisted of 2 chunks' worth of data and then 25 reads that consisted of 1 chunk's worth of data. Those 35 reads together took x seconds to process.
Is that clearer?
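To put numbers on that example, here is a small sketch of the arithmetic. The `batch` structure is made up purely for illustration and is not how readfish stores anything internally.

```python
# Hypothetical batch from the example above:
# (number of reads, chunks accumulated so far per read)
batch = [(10, 2), (25, 1)]

# The "R" count in the log is the number of reads in the batch.
read_starts = sum(n_reads for n_reads, _ in batch)

# The amount of signal processed corresponds to the total number of chunks.
total_chunks = sum(n_reads * n_chunks for n_reads, n_chunks in batch)

print(f"{read_starts}R covers {total_chunks} chunks' worth of signal")
```

So the log line would start with 35R, even though 45 chunks' worth of data went into that batch.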
Oh ok, yes this makes sense now. Thanks!
Hello!
I am trying to make sense of some testing we are doing with different configurations of reference genomes for readfish.
I am taking the log files and processing them in R. The processing takes the log file, which looks like this:
And processes it to look something like this (note: not showing the same piece of the data as above):
I am sure there is still some processing to work out, but what is meant by
303R/0.31916s
? Is that the number of full-length reads processed in that time, or does the "303R" mean the number of read chunks processed? Thanks! Paul