Closed tgjohnst closed 1 year ago
Greetings, apologies for missing this (for nearly an entire year)!
Thanks for identifying this! I added a blurb to the README, and if there are more substantive updates in the future I'll push an update to init_ref.sh and the docker container as well.
Thanks so much for making this workflow available and dockerizing it!
When testing it with a custom viral genome file (`-v`), I noticed that the workflow would run, but I saw a suspicious early `[E::bwa_idx_load_from_disk] fail to locate the index files` message and the rest of the run would continue and eventually fail to find any integration sites.
It turns out this was due to the viral fasta I was supplying not having been indexed with `bwa index` (`init_ref.sh` indexes the joint reference but not the viral one alone), since it is used as the target of the initial mapping step (assumedly your included reference is already indexed). This is easy enough to do but took a while to figure out because there's no documentation in the README suggesting that this file needs to be indexed, and I was trying to figure out if the joint indexing had failed.
As far as solutions, I was thinking of either:
- A note in the README that a custom viral fasta needs to be indexed with `bwa index` (this wouldn't require any repackaging of the docker container)
- An update to `init_ref.sh` that also indexes the supplied viral reference fasta with samtools and bwa if the `-v` flag is specified. If you'd prefer this not be the default behavior, there could be an additional command-line flag to enable it, or a check for a matching bwa index file with the appropriate suffix so it's not reindexed if those files already exist.
Cheers!
Tim J
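For the second option, a minimal sketch of the "only index if the files aren't already there" check might look like the following. This is not taken from `init_ref.sh`: the function names `needs_bwa_index` and `index_viral_ref` are hypothetical, and it assumes `bwa` and `samtools` are on the `PATH` and that the index files sit next to the fasta with the usual `bwa index` suffixes (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`) and the `samtools faidx` suffix (`.fai`).

```shell
#!/usr/bin/env bash
# Hypothetical helpers; these names are illustrative and not part of the workflow.

# Succeed (return 0) if any of the five files written by `bwa index`
# is missing or empty for the given fasta.
needs_bwa_index() {
    local fasta="$1" ext
    for ext in amb ann bwt pac sa; do
        [ -s "${fasta}.${ext}" ] || return 0
    done
    return 1
}

# Index the viral fasta for bwa and samtools, skipping steps already done.
index_viral_ref() {
    local fasta="$1"
    if needs_bwa_index "$fasta"; then
        bwa index "$fasta" || return 1
    fi
    if [ ! -s "${fasta}.fai" ]; then
        samtools faidx "$fasta" || return 1
    fi
}
```

Calling `index_viral_ref path/to/custom_virus.fa` (a placeholder path) before launching the workflow with `-v` would then be a no-op when the index files already exist.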