Closed kfuku52 closed 1 year ago
So I did some testing, and technically amalgkit metadata works without any config files at all. But I think search_term_species.config and search_term_keyword.config should be there, just so amalgkit metadata doesn't try to get the whole SRA.
I'd raise warnings for all missing files (saying that some metadata functionalities may not work properly) and raise an actual error for missing search_term_species.config and search_term_keyword.config.
What do you think? Is this too lenient?
There will be cases where the two files are not necessary. For example, if you specify a BioProject, the species name would be completely redundant and, in many cases, unnecessary to specify. Your concern doesn't seem to be about the specific config files but about having too many SRA entries to process. It might make more sense to raise a warning when a search returns, for example, more than 10k entries.
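A minimal sketch of that threshold warning, assuming the number of hits is already known before the entries are downloaded. The function name `warn_if_too_many_hits` and the constant are hypothetical and not part of amalgkit; the 10,000 default just mirrors the example figure above.

```python
import warnings

# Hypothetical default, taken from the ">10k" example figure in this thread.
MAX_ENTRIES_WITHOUT_WARNING = 10_000


def warn_if_too_many_hits(num_hits, threshold=MAX_ENTRIES_WITHOUT_WARNING):
    """Warn (without failing) when a search matched more entries than threshold.

    Returns True if a warning was issued, False otherwise.
    """
    if num_hits > threshold:
        warnings.warn(
            f"The search matched {num_hits} SRA entries (more than {threshold}). "
            "Consider narrowing the query if this is unintended.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Because this only warns, a deliberately broad query still runs to completion.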
On the other hand, sometimes I want large amounts of entries too. When I was looking for potential species to add to my analysis, I made a fairly open query to gather as much information as possible and ended up with 100k+ entries I could then narrow down manually/by parsing.
In that case, can we leave it to the user's discretion and just raise the warnings? Would you prefer specific warnings for each of the files (i.e. what impact not having that file may have), or just have the warning refer to the wiki where we explain the config files?
In that case, can we leave it to the user's discretion and just raise the warnings?
yes, let's do so.
or just have the warning refer to the wiki where we explain the config files?
I like this idea!
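A rough sketch of what was agreed here: warn for each config file that is not found and point to the wiki instead of raising an error. The function name `find_missing_configs`, the `wiki_url` parameter, and its placeholder value are assumptions for illustration; only the two file names come from this thread, and the real set of config files is larger.

```python
import os
import warnings

# Only the two file names mentioned in this thread; the real list is longer.
EXPECTED_CONFIGS = ("search_term_species.config", "search_term_keyword.config")


def find_missing_configs(config_dir, expected=EXPECTED_CONFIGS, wiki_url="<wiki URL>"):
    """Warn about each expected config file that is absent from config_dir.

    Nothing is raised; the list of missing file names is returned so the
    caller can decide what to do with it.
    """
    missing = [
        name for name in expected
        if not os.path.isfile(os.path.join(config_dir, name))
    ]
    for name in missing:
        warnings.warn(
            f"Config file '{name}' was not detected in '{config_dir}'. "
            "Some metadata functionality may not work properly. "
            f"See {wiki_url} for a description of the config files.",
            UserWarning,
            stacklevel=2,
        )
    return missing
```

Returning the missing names keeps the check easy to test and leaves the decision to continue with the caller.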
Currently, amalgkit metadata requires a complete set of config files. However, some config files are rarely used, so amalgkit probably shouldn't raise an error but just print that it didn't detect a file that may be used as input. @Hego-CCTB Could you take care of it?