Hello there,
congrats on this package, I loved it.
Some minor comments:
1) Update LICENSE, removing the placeholders:
<program> Copyright (C) <year> <name of author>
2) Slightly expanded documentation would be beneficial, in particular documenting the file formats (input and output files). The readme is fantastic for getting a "worked" example, but reference documentation in a separate md file might be a useful addition.
Some examples:
what should taxonomy_ambiguities.txt be checked for (in the build subcommand)?
what is the format of the summary? (the readme describes 3 columns, I found 4)
3) Please, add a CONTRIBUTING.md briefly defining how to contribute to the project, maybe adding a link to a code of conduct.
4) Installation is easy, but adding the package to BioConda would be very beneficial for the bioinformaticians planning to use the tool in pipelines. Is this planned for later?
5) From the statement of need, it looks like the (highly appreciated) flexibility provided by Sepia in terms of database creation could have been achieved with tools that help format reference sequences in a Kraken-compatible format (ad hoc NCBI taxonomy), without reimplementing the whole thing (unless I'm mistaken here).
In this light, it would add value for the reader to see a simple comparison of performance and sensitivity/specificity between Kraken2 and Sepia using a similar database.
Thank you for your helpful comments and speedy review!
Here is our response to your comments:
Some minor comments:
Update LICENSE, removing the placeholders:
• We updated the license and removed the placeholders.
Slightly expanded documentation would be beneficial, in particular documenting the file formats (input and output files). The readme is fantastic for getting a "worked" example, but reference documentation in a separate md file might be a useful addition.
• We plan to expand the documentation further, including a section on how to build indices from (for example) the GTDB database. For now, we have addressed the examples mentioned below.
Some examples:
• what should taxonomy_ambiguities.txt be checked for (in the build subcommand)?
We added a description of the contents of the taxonomy_ambiguities.txt file, with a couple of real-life examples.
• what is the format of the summary? (the readme describes 3 columns, I found 4)
We added a description of the fourth column of the summary file.
Please, add a CONTRIBUTING.md briefly defining how to contribute to the project, maybe adding a link to a code of conduct.
• We added the requested CONTRIBUTING.md.
Installation is easy, but adding the package to BioConda would be very beneficial for the bioinformaticians planning to use the tool in pipelines. Is this planned for later?
• We plan to add the package to Bioconda.
From the statement of need, it looks like the (highly appreciated) flexibility provided by Sepia in terms of database creation could have been achieved with tools that help format reference sequences in a Kraken-compatible format (ad hoc NCBI taxonomy), without reimplementing the whole thing (unless I'm mistaken here).
In this light, it would add value for the reader to see a simple comparison of performance and sensitivity/specificity between Kraken2 and Sepia using a similar database.
• We added a few sentences comparing the performance and sensitivity/specificity of Kraken2 and Sepia using a similar database.
https://github.com/openjournals/joss-reviews/issues/3839