Closed MajoroMask closed 1 year ago
Should I just use a different 'db_name' for each tool? I'm not quite sure if it's reasonable, but for now it seems to be a proper workaround.
Yes, it is the proper workaround. We will need to think about whether we should enforce that and, if so, fail at the input check.
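For illustration, an input check of that kind could be sketched as below. This is only a rough sketch in Python rather than pipeline code: the row layout (a 'tool' key next to 'db_name' and 'db_params') and the function name find_shared_db_names are assumptions made for the example, not the pipeline's actual implementation.

```python
from collections import defaultdict


def find_shared_db_names(rows):
    """Return {db_name: sorted list of tools} for names used by more than one tool.

    `rows` is a list of dicts with (assumed) keys 'tool' and 'db_name'.
    """
    tools_by_name = defaultdict(set)
    for row in rows:
        tools_by_name[row["db_name"]].add(row["tool"])
    return {
        name: sorted(tools)
        for name, tools in tools_by_name.items()
        if len(tools) > 1
    }


# Hypothetical database sheet mirroring the report: kraken2 and bracken share a name.
rows = [
    {"tool": "kraken2", "db_name": "mydb", "db_params": "--quick"},
    {"tool": "bracken", "db_name": "mydb", "db_params": ""},
    {"tool": "kraken2", "db_name": "other", "db_params": ""},
]

shared = find_shared_db_names(rows)
if shared:
    # A strict check would fail here; a lenient one would only warn.
    print(f"db_name shared between tools: {shared}")
```

Whether such a check should fail or only warn is exactly the open question above.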
OK, there is a vote to keep allowing the same database name.
We will then need to work out how to remove the bracken kraken2 output from the ch_krona_text input channel, as correctly identified by @MajoroMask!
So the issue derives from here, so I guess we need to reflect on whether this would actually be a problem or not.
I think the general solution will be to make a fake profiler name... and use 'else|or' statements in downstream places. Currently those files are used in the profile standardisation and visualisation workflows.
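As a rough illustration of why a separate profiler label would help, here is a small Python sketch (not Nextflow, and not the pipeline's code). The label 'kraken2-bracken' and the helper names are hypothetical; only the file name '2611_se.txt' comes from the report.

```python
from collections import Counter


def staged_names(entries, with_label=False):
    """Build the file names that would be staged together.

    `entries` is a list of (profiler_label, file_name) pairs. With
    with_label=True the label is prefixed so names stay unique per profiler.
    """
    if with_label:
        return [f"{label}_{name}" for label, name in entries]
    return [name for _label, name in entries]


def duplicates(names):
    """Return the sorted names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# The same sample profiled by kraken2 directly and by kraken2 run for bracken.
entries = [
    ("kraken2", "2611_se.txt"),
    ("kraken2-bracken", "2611_se.txt"),  # hypothetical fake profiler name
]

print(duplicates(staged_names(entries)))                   # colliding names
print(duplicates(staged_names(entries, with_label=True)))  # no collision left
```

With a distinct label, downstream steps could also filter on it, for example to keep the bracken-invoked kraken2 output out of the Krona input.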
Looking at the comment, indeed we made a faulty assumption about the database names :sweat_smile:
Description of the bug
I ran the test profile with a custom reference, where kraken2 and bracken have the same 'db_name'. The --database I'm using looks like below (notice I added '--quick' to 'db_params' for kraken2 so I can mark the difference).
Error message:
I think the problem is in subworkflows/local/visualization_krona.nf.
In my case, the channel ch_krona_text looks like below. It turns out that the profile channel passes the kraken2 results from both the main workflow and bracken downstream.
In contrast, the profile does tell the difference in file name between kraken2 called by the main workflow and kraken2 called by bracken. In this case the profile channel looks like this:
If the same db_name is assigned to kraken2 and bracken (which I think is reasonable, since I built these two references from the same genome sequences and the same NCBI taxdump), the channel ch_krona_text_for_import will receive input files with exactly the same file name (in this case I get two files for each sample, like two '2611_se.txt'), which then causes the error.
Command used and terminal output
No response
Relevant files
No response
System information
No response