Open pfsulliv opened 5 years ago
bedtools-discuss post:
Feature request: the capacity for bedtools to add and extract machine-readable metadata. I have searched this group for "metadata" and "comment" (and only found a 2015 feature request, https://groups.google.com/d/msg/bedtools-discuss/TetrJYsJHX4/af5fEo1UAAAJ ). Forgive me if I've missed something.
My concern is error and reproducibility. We now include a lot of information in the bloody file name but in highly variable ways; I doubt anyone believes this is optimal. VCF is in a sense the opposite, as VCF files usually have exhaustive headers that explicitly define pretty much everything in the file (some headers run to 100s of lines). I am not arguing for the full VCF approach.
However, what about defining and adding support in bedtools to add/read a reasonable but minimal set of header lines for bed files? Ideally, there would be a way to extract these from many files and to make them into a table (that could be put into a supplement). Comments appear to be possible in bed files (per UCSC), lines beginning with "#".
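To make the "extract these from many files and make them into a table" idea concrete, here is a rough Python sketch. It is not an existing bedtools feature. The "#key=value" line format, the function names read_header and headers_to_table, and the file names and values in the example are all assumptions made up for illustration.

```python
import csv
import io

def read_header(lines):
    """Collect '#key=value' comment lines that precede the first data line."""
    meta = {}
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        key, sep, value = line.lstrip("#").strip().partition("=")
        if sep:  # skip free-text comments that have no '='
            meta[key.strip()] = value.strip()
    return meta

def headers_to_table(named_files, out):
    """Write one row per file and one column per metadata key, tab-separated."""
    rows = {name: read_header(lines) for name, lines in named_files.items()}
    keys = sorted({k for meta in rows.values() for k in meta})
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["file"] + keys)
    for name, meta in rows.items():
        # a key missing from a file becomes an empty cell
        writer.writerow([name] + [meta.get(k, "") for k in keys])

# Hypothetical inputs: lists of lines stand in for open BED files.
a = ["#genome=hg19\n", "#organism=human\n", "chr1\t100\t200\n"]
b = ["#genome=mm10\n", "chr2\t5\t50\n"]
buf = io.StringIO()
headers_to_table({"a.bed": a, "b.bed": b}, buf)
table = buf.getvalue()
```

With real files, each list of lines could be replaced by an open file handle, and the resulting tab-separated table could go straight into a supplement.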
The above are just some ideas cobbled together from various sources.
All of this is open to modification of any sort.
Added here at Ryan Layer's request.
I put a post on bedtools-discuss (https://groups.google.com/forum/#!topic/bedtools-discuss/t6E74mCQb-E), suggesting that you support adding and retrieving headers in bed files. I am not suggesting that one add something as extensive as in VCF, but a handful of clearly defined “##” fields would be exceptionally useful (genome reference, organism, description, and source would seem to be key).
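As a sketch of what adding such a header might look like, the Python below prepends "##key=value" lines to the lines of a bed file. Nothing here is a bedtools API: the "##key=value" layout, the add_header function, the field names, and the example values are all hypothetical. Since the lines still begin with "#", they should remain ordinary comments to tools that follow the UCSC convention.

```python
def add_header(bed_lines, metadata):
    """Return the BED lines with '##key=value' metadata lines placed on top."""
    header = [f"##{key}={value}\n" for key, value in metadata.items()]
    # Drop any earlier '##' lines so that re-running replaces the header
    # instead of stacking a second copy on top of the first.
    body = [line for line in bed_lines if not line.startswith("##")]
    return header + body

# Hypothetical field names and values, following the four fields suggested above.
metadata = {
    "genome_reference": "GRCh38",
    "organism": "Homo sapiens",
    "description": "example peak calls",
    "source": "example lab",
}
bed = ["chr1\t100\t200\n", "chr1\t300\t400\n"]
annotated = add_header(bed, metadata)
```

Keeping the header to a few fixed keys is the point of the request: it stays short enough to read by eye while still being easy for a program to parse.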