Closed xuebingjie1990 closed 4 months ago
The result should be reported using pipestat
so that the final output statistics table shows which files are not passing QC.
It should also be possible to find out why they are not passing QC.
Right now, bedqc raises an error when a file does not pass QC: https://github.com/databio/bedboss/blob/948f5642ce8acc60e836481e984dd57fdb9a89e5/bedboss/bedqc/bedqc.py#L104
bedboss should create (or open) a CSV file that reports why a file didn't pass QC. I think this is already done: https://github.com/databio/bedboss/blob/948f5642ce8acc60e836481e984dd57fdb9a89e5/bedboss/bedqc/bedqc.py#L94-L102
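As a rough illustration of the CSV report described above (not the code at the linked lines), a helper could append one row per failing file and write a header only when the file is new. The function name `append_qc_failure` and the column names `file_name` and `reasons` are hypothetical.

```python
import csv
from pathlib import Path


def append_qc_failure(csv_path, bed_file, reasons):
    """Append one row describing why `bed_file` failed QC.

    Hypothetical helper: creates the CSV with a header if it does not
    exist (or is empty), otherwise appends to the existing file.
    """
    path = Path(csv_path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:
            # Column names are an assumption, not bedboss's actual schema.
            writer.writerow(["file_name", "reasons"])
        writer.writerow([str(bed_file), "; ".join(reasons)])
```

Joining the reasons into one cell keeps a single row per file, so the report lines up with a per-file statistics table.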
Currently, bedqc flags bed files that 1) are larger than 2 GB, 2) have over 5 million regions, or 3) have a mean region width of less than 10 bp. A manual step is then required to either remove the flagged files or keep them for downstream processing. Instead, we should improve the filters and add functions to deal with the flagged files so that bedqc can be part of the automated process.
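One possible shape for this, sketched under assumptions: instead of raising on the first failed check, a function collects every reason a file fails and returns the list, so the caller can report it (for example through pipestat) and decide whether to skip the file. The name `qc_reasons`, its parameters, and reading "2 GB" as `2 * 1024**3` bytes are all assumptions, not bedqc's current implementation.

```python
import gzip
import os

# Thresholds taken from the issue text; the byte interpretation of 2 GB is assumed.
MAX_FILE_SIZE = 2 * 1024**3
MAX_REGION_NUMBER = 5_000_000
MIN_REGION_WIDTH = 10


def qc_reasons(bed_path, max_file_size=MAX_FILE_SIZE,
               max_region_number=MAX_REGION_NUMBER,
               min_region_width=MIN_REGION_WIDTH):
    """Return a list of reasons the file fails QC (empty list means pass)."""
    reasons = []

    file_size = os.path.getsize(bed_path)
    if file_size > max_file_size:
        reasons.append(f"file size {file_size} B exceeds {max_file_size} B")

    # Count regions and accumulate widths in one pass over the file.
    opener = gzip.open if str(bed_path).endswith(".gz") else open
    n_regions = 0
    total_width = 0
    with opener(bed_path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip header/comment lines and anything without chrom, start, end.
            if line.startswith(("#", "track", "browser")) or len(fields) < 3:
                continue
            n_regions += 1
            total_width += int(fields[2]) - int(fields[1])

    if n_regions > max_region_number:
        reasons.append(f"{n_regions} regions exceeds {max_region_number}")

    if n_regions > 0 and total_width / n_regions < min_region_width:
        mean_width = total_width / n_regions
        reasons.append(f"mean region width {mean_width:.1f} bp is below {min_region_width} bp")

    return reasons
```

A caller in an automated run could then do something like `if reasons: report and skip` rather than stopping the whole pipeline on an exception.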