Closed agalitsyna closed 3 years ago
Awesome, thanks a lot Sasha!
Two questions.
Do you think local cooler files should be copied from specified locations? I like the tidiness and reproducibility of that, but they were not copied before to avoid data duplication. I think Max previously wanted to run quaich on loads of cool files, and wanted to avoid having to copy them, which I understand: high-resolution cooler files from deeply sequenced experiments take up quite a lot of space. An option is to try hard-linking instead of copying, if the user prefers that. Soft linking doesn't play well with HDF5 for some reason.
Have you tried running it? You fixed some ugly whitespace I had to introduce, because for whatever reason snakemake wouldn't run for me with nice and tidy tabs. Does it run for you?
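To make the hard-linking idea concrete, here is a minimal sketch of what it could look like. The helper name link_or_copy is hypothetical and not part of quaich; it tries a hard link first and falls back to a plain copy when linking is not possible, for example when source and destination are on different filesystems.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src, dst):
    """Hard-link src to dst; fall back to copying if linking fails.

    Hypothetical helper for illustration only, not part of quaich.
    Returns "linked" or "copied" so the caller can log what happened.
    """
    src, dst = Path(src), Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        dst.unlink()  # replace a stale file from a previous run
    try:
        os.link(src, dst)  # same data on disk, no extra space used
        return "linked"
    except OSError:
        # Typically raised when src and dst live on different filesystems
        # or the filesystem does not support hard links.
        shutil.copy2(src, dst)
        return "copied"
```

A hard link is an ordinary directory entry pointing at the same data, so a reader sees a regular file rather than a symlink, and the fallback keeps the pipeline working where linking is unavailable.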
Good point, I've removed the option to copy.
I don't run the pipeline fully since I don't need mustache and many other things. Basically, it runs without syntax errors. What error do you get?
To align it with snakemake conventions, I've run the code through snakefmt. You can check it now and fix any lines that cause problems on your system.
Namespace(anchor=None, baselist='inputs/beds/CTCF_test1.bed', bed2=None, bed2_ordered=True, by_window=False, coolfile='inputs/coolers/test.mcool::resolutions/1000', coverage_norm=False, excl_chrs='chrY,chrM', expected=None, ignore_diags=2, incl_chrs='all', local=False, logLevel='INFO', maxdist=None, maxshift=1000000, maxsize=None, mindist=None, minshift=100000, minsize=None, n_proc=5, nshifts=1, outdir='results/pileups/beds', outname='test1-1000_over_CTCF_test1.bed_1-shifts.np.txt', pad=100, post_mortem=False, rescale=False, rescale_pad=1.0, rescale_size=99, save_all=False, seed=None, subset=0, unbalanced=False, weight_name='weight')
Traceback (most recent call last):
  File "/home/agalicina/anaconda3/envs/quaich/bin/coolpup.py", line 8, in <module>
    sys.exit(main())
  File "/home/agalicina/anaconda3/envs/quaich/lib/python3.8/site-packages/coolpuppy/main.py", line 409, in main
    CC = CoordCreator(
  File "/home/agalicina/anaconda3/envs/quaich/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 353, in __init__
    self.process()
  File "/home/agalicina/anaconda3/envs/quaich/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 704, in process
    self.bases, self.kind = self.auto_read_bed(self.baselist)
  File "/home/agalicina/anaconda3/envs/quaich/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 389, in auto_read_bed
    int(row1[4]),
ValueError: invalid literal for int() with base 10: '10.240026710177574'
Cool, I couldn't run snakefmt because of this: https://github.com/snakemake/snakefmt/issues/89
Right, coolpuppy doesn't expect any columns apart from chrom, start, end in a bed file, because it uses the number of columns to guess whether it's a bed or bedpe file, so it has to be strictly 3 or 6 columns. I guess our test file happened to have 6 columns, but they were not correct for bedpe...
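For illustration, the column-count guess could be sketched roughly like this. This is not coolpuppy's actual code: the function name guess_interval_kind and the error messages are made up here, and the sketch only assumes tab-separated input where the coordinate columns must parse as integers.

```python
def guess_interval_kind(line):
    """Guess "bed" (3 columns) or "bedpe" (6 columns) from one data line.

    Illustrative sketch only; coolpuppy's real logic may differ.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 3:
        kind, int_columns = "bed", (1, 2)  # start, end
    elif len(fields) == 6:
        kind, int_columns = "bedpe", (1, 2, 4, 5)  # start1, end1, start2, end2
    else:
        raise ValueError(f"expected 3 or 6 columns, got {len(fields)}")
    for i in int_columns:
        try:
            int(fields[i])
        except ValueError:
            raise ValueError(
                f"column {i + 1} should be an integer coordinate for {kind}, "
                f"got {fields[i]!r}"
            ) from None
    return kind
```

With this kind of guess, a 6-column bed line with a floating-point score in column 5 is treated as bedpe and fails on that column, which matches the ValueError in the traceback above.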
Ah, force=True only enforces the logging regime. If you remove this argument, it won't change the output of snakefmt...
Yeah, I know, but I couldn't be bothered to change the source code. I'll just let them fix it and install it once the PR is merged.
Okay, maybe we will transition to Python 3.8 before they fix this single argument. Does the formatted version work with your snakemake environment?
I'll give it a try in a bit and report back, thanks!
It runs for me, thanks! Need to sort out the columns of the test CTCF file then...
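One possible way to sort out the columns, sketched under the assumption that the test file is tab-separated and only chrom, start, end are needed. The function name keep_first_three_columns is hypothetical and not part of quaich or coolpuppy.

```python
def keep_first_three_columns(lines):
    """Yield tab-separated lines trimmed to chrom, start, end.

    Hypothetical helper for illustration; skips blank lines and
    header-like lines starting with '#', 'track' or 'browser'.
    """
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns, got {len(fields)}")
        yield "\t".join(fields[:3])
```

It could be used by reading the original file line by line and writing each yielded line, followed by a newline, to a new 3-column file.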
I pushed some small changes to master that broke the merge. I think I fixed the conflicts though.
I think this is good to merge, any thoughts @agalitsyna?
I added another feature that I forgot to push earlier, but will just merge now. Thanks again!