Cell types found in the metadata file are later used as file names, so it feels safer to restrict them to `[0-9a-zA-Z_]` to avoid surprises. I was once given a metadata file with `/` in the cell types, which crashed everything.
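As a rough illustration of the kind of check I mean, here is a minimal sketch. The helper names `is_safe_cell_type` and `sanitize_cell_type` are hypothetical and not part of SComatic, and whether to reject or rewrite an offending name is a design choice rather than something settled here.

```python
import re

# Hypothetical helpers for illustration only; not SComatic functions.
_ALLOWED = re.compile(r"[0-9a-zA-Z_]+")
_DISALLOWED_RUN = re.compile(r"[^0-9a-zA-Z_]+")


def is_safe_cell_type(name: str) -> bool:
    """Return True if the name is non-empty and uses only [0-9a-zA-Z_]."""
    return _ALLOWED.fullmatch(name) is not None


def sanitize_cell_type(name: str) -> str:
    """Replace each run of disallowed characters with a single underscore."""
    return _DISALLOWED_RUN.sub("_", name)
```

Rejecting bad names outright is probably the safer of the two, since rewriting can make two distinct cell types collide (for example `A/B` and `A B` would both become `A_B`).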
In the issue where we talked about running this on multiome, I mentioned that I had started processing the BAMs for each sample separately for multi-sample donors. Having multiple samples per donor is probably common these days, and an option to prepend the sample ID to the BAM's original `CB` tag during the first step of SComatic is a good way to help with run time. Maybe someone will find it useful. It would also make my pipeline cleaner, as I would no longer need a separate file carrying a handful of changes.
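To make the barcode change concrete, here is a minimal sketch that works on a single SAM text record rather than on a BAM directly. The function name `prefix_cb_tag` and the `_` separator are assumptions for illustration, not SComatic's actual interface; the barcodes in the metadata file would need to follow the same convention.

```python
def prefix_cb_tag(sam_line: str, sample_id: str, sep: str = "_") -> str:
    """Prepend sample_id to the CB:Z tag of one SAM record, if the tag is present.

    Hypothetical helper for illustration; header lines and records without
    a CB tag are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line  # header line, nothing to rewrite

    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")

    # The first 11 columns are the mandatory SAM fields; optional tags follow.
    for i in range(11, len(fields)):
        if fields[i].startswith("CB:Z:"):
            barcode = fields[i][len("CB:Z:"):]
            fields[i] = f"CB:Z:{sample_id}{sep}{barcode}"
            break

    return "\t".join(fields) + newline
```

With the sample ID in the barcode, cells from different samples of the same donor stay distinguishable even when the raw barcodes overlap.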
Both changes stem from practical use.