Closed khoroshevskyi closed 1 year ago
This is really great! I think this is a great start and I think this is way better structure than what we had before. I wasn't able to review everything because I ran into some blockers that I couldn't get past, though. There's a lot we can do to clean this up and make it better. Here's my suggestions for you:
Docs feedback:
mkdocs
setup for this. All it requires is copying and editing a mkdocs yaml file and then I will be able to view the docs in a nice, linked way with mkdocs serve
. You already have the structure set up correctly.bedboss boss
it should be bedboss all
. Or should we introduce subparsers, bedboss run qc
bedboss run make
bedboss run all
?Code feedback:
args = parser.parse_args(sys.argv[1:2])
and creating separate argument parsers. This could become a maintenance challenge. I believe the accepted way to build an ArgumentParser with multiple commands like this is to use subparsers. This is what we have done in the past. See for example: https://github.com/pepkit/looper/blob/df535be2a0bd7257ee37148cb601d688ec557909/looper/__init__.py#L61config_db_local.yaml
is duplicated in pipeline_schemas/
and bedboss/
pipeline_schemas
folder, but this is confusing to me... Actually I can't really follow what the items in pipeline_schemas are for.used the
est_line.sh` shell script under the 'bedmaker' folder, which confused me.bedboss qc
uses --bedfile
and then bedboss make
uses --input-file
-- can we standardize these inputs somehow? --- OK: it uses bedfile, because bedqc can take only bed filesget_bed_type
is doing 2 tasks, these should be split into functions with appropriate names._get_chrom_sizes
doesn't tell the whole story. this function is also potentially creating something. It should be divided into two appropriately named functions.-n
the docs seem to indicate a boolean but then shouldn't it be a flag argument? What's the variable for?: "-n NARROWPEAK, --narrowpeak NARROWPEAK whether the regions are narrow (transcription factor implies narrow, histone mark implies broad peaks)" -t
, the docs say it's a bigwig file, but the arg is called INPUT_TYPE, and it's not clear what goes there? "-t INPUT_TYPE, --input-type INPUT_TYPE a bigwig or a bedgraph file that will be converted into BED format"@nsheff @xuebingjie1990 It seams that bedboss is ready for it's first release:) Could you please check it again, and see if something else should be added
Probably bedboss is not ready yet, but I would like to here your comments
1) Composed all scripts together 2) Added some corrections to the scripts
Things to finish: