Closed bschilder closed 2 years ago
I'm not sure about that approach. The order is defined by the mapping file, so maybe a better option would be to warn the user when multiple columns match the same mapped value, tell them which one will be used, and ask them to edit the mapping file if that is not the correct one. I don't think we can assume an order of priority here.
True, but aren't we already kind of assuming a priority simply by ordering them (albeit arbitrarily) in the colmap file? Perhaps the best thing would be to just move up "FREQUENCY" in the colmap so that it's the first hit by default, but still provide the user a warning message about the ambiguity?
Yeah that sounds reasonable!
Updated order of FREQUENCY and MAF
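The behaviour agreed on above (the first match in mapping order wins, with a warning when several columns compete for the same standard name) could look roughly like the sketch below. It is an illustrative Python sketch, not the package's actual implementation: the `COLMAP` contents and this `standardise_header` signature are assumptions made up for the example.

```python
import warnings

# Hypothetical mapping of (uncorrected, corrected) header names, listed in
# priority order. This is NOT the package's real mapping file.
COLMAP = [
    ("FRQ", "FRQ"),        # an already-standard name always wins
    ("FREQUENCY", "FRQ"),  # closest to the term "frequency", so listed before MAF
    ("MAF", "FRQ"),
    ("SNP", "SNP"),
    ("RSID", "SNP"),
]


def standardise_header(columns, colmap=COLMAP):
    """Uppercase the headers and rename at most one column per standard name."""
    upper = [c.upper() for c in columns]

    # Collect, per standard name, every matching column in mapping order.
    candidates = {}
    for uncorrected, corrected in colmap:
        if uncorrected in upper:
            candidates.setdefault(corrected, []).append(uncorrected)

    rename = {}
    for corrected, hits in candidates.items():
        chosen = hits[0]  # the first hit in the mapping file takes priority
        if len(hits) > 1:
            warnings.warn(
                f"Columns {hits} all map to '{corrected}'; using '{chosen}'. "
                "Edit the mapping file if this is not the correct one."
            )
        rename[chosen] = corrected

    # Columns that lost the tie are only uppercased, never renamed.
    return [rename.get(c, c) for c in upper]
```

With this ordering, a dataset containing both "frequency" and "MAF" gets "frequency" renamed to "FRQ" while "MAF" is kept under its own name, and the user is warned about the ambiguity.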
I noticed that if the data has both the columns "frequency" and "MAF", the latter is renamed to "FRQ" while the former is just made uppercase.
I think this is because both are technically mapped to "FRQ", but it makes sense to prioritize columns whose names are closer to the term "frequency" when choosing the "FRQ" column. Only when none of these are available should we go ahead and try renaming MAF --> FRQ.
I can't recall, but perhaps this is already done within the full pipeline `format_sumstats`. So maybe I just need to update the `standardise_header` function? Not sure if that will screw anything up for the full pipeline.
Reprex
Session info