armartin / ancestry_pipeline

Provides helper scripts for inferring local ancestry, performing ancestry-specific PCA, etc

What's the format of the bed file for plotting? #8

Open mocksu opened 5 years ago

mocksu commented 5 years ago

The manual gives the following command for plotting:

IND='HG02481'; python plot_karyogram.py \
  --bed_a ${IND}_A.bed \
  --bed_b ${IND}_B.bed \
  --ind ${IND} \
  --out ${IND}.png

What's the format of the .bed file? Can you give an example? Does it look like the following?

21 100 200 EUR
21 200 800 AFR
...

One .bed file for hap1, and the other for hap2?

Thanks so much!

cwarden45 commented 5 years ago

I think this is what you are looking for: https://github.com/slowkoni/rfmix/issues/15#issuecomment-490323478

Yes, the input is two .bed files, but the format is different from what you provided (and it doesn't strictly conform to the .bed file format). I've copied over the relevant part of that issue thread here:

I figured out the "TRACTS-compatible" .bed format here:

https://github.com/sgravel/tracts

Also, I had to set "X" to "23" in the _centromereshg19.bed file, in order to get a .bed file in the following format to work:

1   1   249250621   ACB 0   293.397000284835
2   1   243199373   ACB 0   274.878930310443
3   1   198022430   ACB 0   227.853358434947
4   1   191154276   ACB 0   220.303824741903
5   1   180915260   ACB 0   208.956219767350
6   1   171115067   ACB 0   198.265021323850
7   1   159138663   ACB 0   190.600127463707
8   1   146364022   ACB 0   178.153765066457
9   1   141213431   ACB 0   180.473427901245
10  1   135534747   ACB 0   183.112946051943
11  1   135006516   ACB 0   161.856040732417
12  1   133851895   ACB 0   175.135319818982
13  1   115169878   ACB 0   129.792343512781
14  1   107349540   ACB 0   116.769337917254
15  1   102531392   ACB 0   150.840678460388
16  1   90354753    ACB 0   131.133808635308
17  1   81195210    ACB 0   128.534336575192
18  1   78077248    ACB 0   120.142569670562
19  1   59128983    ACB 0   106.857742811287
20  1   63025520    ACB 0   110.205396293913
21  1   48129895    ACB 0   64.651463886813
22  1   51304566    ACB 0   75.1198605516705
23  1   155270560   ACB 0   202.1689

In other words, as described in the tracts documentation, the file format should be as follows:

chrom   begin   end assignment  cmBegin      cmEnd
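
If it helps, here is a rough Python sketch for sanity-checking a file against that six-column layout before plotting. It is only an illustration: `Tract` and `parse_tracts_bed` are names I made up (they are not part of ancestry_pipeline or tracts), and I'm assuming whitespace-separated columns with no header line.

```python
from typing import List, NamedTuple


class Tract(NamedTuple):
    """One row of the six-column layout (hypothetical helper type)."""
    chrom: str
    begin: int
    end: int
    assignment: str
    cm_begin: float
    cm_end: float


def parse_tracts_bed(text: str) -> List[Tract]:
    """Parse whitespace-separated rows; raise ValueError on malformed ones."""
    tracts = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(
                f"line {line_no}: expected 6 columns, got {len(fields)}"
            )
        chrom, begin, end, assignment, cm_begin, cm_end = fields
        tract = Tract(chrom, int(begin), int(end), assignment,
                      float(cm_begin), float(cm_end))
        # Both the physical and the genetic coordinates should run forward.
        if tract.begin > tract.end or tract.cm_begin > tract.cm_end:
            raise ValueError(f"line {line_no}: begin is greater than end")
        tracts.append(tract)
    return tracts


# Two rows copied from the example above.
example = (
    "21\t1\t48129895\tACB\t0\t64.651463886813\n"
    "22\t1\t51304566\tACB\t0\t75.1198605516705\n"
)
tracts = parse_tracts_bed(example)
```

A four-column row such as `21 100 200 EUR` would raise a `ValueError` here, since the two centimorgan columns are missing.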

I hope this helps!

P.S. Either you or Alicia has to close the issue :)