Closed klmartinez closed 10 months ago
Hi,
If I understand it well, you just want to phase a single region, therefore there is not need to run GLIMPSE2_chunk. The reason why we run GLIMPSE2_chunk is to have uniform chunks across a chromosome (with overlapping buffers) and then ligate the phased data to produce phased data chromosome-wide.
However, as you are just trying to phase a 1Mb region, you can simply provide it to phase_common using the --region parameter. As it's such a small region, (43554 variants), likely you don't even need to run phase_rare (so you might want to use --filter-maf =0, allowing the program to consider all variants).
In the case you need to run phase_rare for computational reasons, you just need to specify the same region using the --input-region option.
Hope this helps
Simone
Hi Simone,
I had tried skipping the GLIMPSE2_chunk step and did phase_rare on its own, but I get prompted to add in the --scaffold-region flag and associated input.
I ran ./phase_rare_static --input mt_chr9_serology_subset_locus_srwgs_v7.bcf --scaffold target.scaffold.bcf --map genetic_map_hg38_chr9.txt --input-region chr9 --output target.phased.chunk.bcf --thread 16
And got the following error:
[SHAPEIT5] Phase_rare (phase rare variants onto a haplotype scaffold)
* Authors : Simone RUBINACCI & Olivier DELANEAU, University of Lausanne
* Contact : simone.rubinacci@unil.ch & olivier.delaneau@gmail.com
* Version : 5.1.1 / commit = 990ed0d / release = 2023-05-08
* Run date : 08/08/2023 - 22:47:38
ERROR: --scaffold missing
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<timed eval> in <module>
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2460 with self.builtin_trap:
2461 args = (magic_arg_s, cell)
-> 2462 result = fn(*args, **kwargs)
2463 return result
2464
/opt/conda/lib/python3.7/site-packages/IPython/core/magics/script.py in named_script_magic(line, cell)
140 else:
141 line = script
--> 142 return self.shebang(line, cell)
143
144 # write a basic docstring:
<decorator-gen-103> in shebang(self, line, cell)
/opt/conda/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
/opt/conda/lib/python3.7/site-packages/IPython/core/magics/script.py in shebang(self, line, cell)
243 sys.stderr.flush()
244 if args.raise_error and p.returncode!=0:
--> 245 raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
246
247 def _run_script(self, p, cell, to_close):
CalledProcessError: Command 'b'./phase_rare_static --input mt_chr9_serology_subset_locus_srwgs_v7.bcf --map genetic_map_hg38_chr9.txt --input-region chr9 --output target.phased.chunk.bcf --thread 16\n'' returned non-zero exit status 1.
Hi,
Indeed, phase_rare
always requires a scaffold derived from phase_common
. Can you please try to run phase_common
instead with the --filter-maf =0
option?
If you have a scaffold (e.g. derived from SNP array data) and want to use it, you can add the --scaffold
option in phase_common too. It is unclear to me if this is your case.
Hopefully this works, but it might be slow. In that case, you might need to run phase_rare as well. But let's focus on getting things working first.
Simone
Things worked with phase_common
using the --filter-maf =0
option.
Great. Closing for now.
Hi,
I am trying to phase a 1Mb locus (I only need the phased genotypes for a handful of SNPs found in the middle of this locus). However, I see that phase_rare relies on reading in inputs from chunks.coordinates.txt. When using GLIMPSE2_chunk to create chunks.coordinates.txt, I receive text describing that the region is too small, and the resulting chunks.coordinates.txt is empty.
Is there a workaround for this?
Here is the text: