lh3 / hickit

TAD calling, phase imputation, 3D modeling and more for diploid single-cell Hi-C (Dip-C) and general Hi-C

[E::hk_sd_ploidy_XY] multiple chr contain 'X' or 'Y' in names #39

Closed zcq23 closed 3 months ago

zcq23 commented 3 months ago

Hi, I ran into a problem when running hickit:

$ hickit -i ${OUTPUT}/structure/SRR25505058.impute.pairs_with_chromsize.gz -Sr1m -c1 -o out.pairs
[M::hk_map_read] read 694017 pairs
[E::hk_sd_ploidy_XY] multiple chr contain 'X' or 'Y' in names

My hickit version is r291.

head -n 50 SRR25505058.impute.pairs_with_chromsize.gz:

$ zcat filtered_pairs_with_chromsize.gz | head -n 50
## pairs format v1.0
#sorted: chr1-chr2-pos1-pos2
#shape: upper triangle
#chromsize: 1 197195432
#chromsize: 2 181748087
#chromsize: 3 159599783
#chromsize: 4 155630120
#chromsize: 5 152537259
#chromsize: 6 149517037
#chromsize: 7 152524553
#chromsize: 8 131738871
#chromsize: 9 124076172
#chromsize: 10 129993255
#chromsize: 11 121843856
#chromsize: 12 121257530
#chromsize: 13 120284312
#chromsize: 14 125194864
#chromsize: 15 103494974
#chromsize: 16 98319150
#chromsize: 17 95272651
#chromsize: 18 90772031
#chromsize: 19 61342430
#chromsize: X 166650296
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2 phase0 phase1 phase_prob00 phase_prob01 phase_prob10 phase_prob11
. chr1 3003625 chr1 7400077 + + . . 0.836 0.008 0.002 0.154
. chr1 3009032 chr1 7403384 + + . . 0.839 0.008 0.002 0.151
. chr1 3012303 chr1 4114735 + + 1 1 0.000 0.000 0.000 1.000
. chr1 3018787 chr1 61903331 + + 1 . 0.000 0.000 0.004 0.996

tail -n 50 SRR25505058.impute.pairs_with_chromsize.gz:

. chrX 169912765 chrX 169915788 + + 1 1 0.000 0.000 0.000 1.000
. chrX 170813085 chrX 170857943 + + 1 1 0.000 0.000 0.000 1.000
. chrX 31780402 chrY 4021769 + + 1 0 0.000 0.000 1.000 0.000
. chrX 100513028 chrY 9488953 + + 1 0 0.000 0.000 1.000 0.000
. chrX 120299636 chrY 2161589 + + 1 0 0.000 0.000 1.000 0.000
. chrX 142046394 chrY 2161622 + + 1 0 0.000 0.000 1.000 0.000
. chrX 143483022 chrY 4156800 + + 1 0 0.000 0.000 1.000 0.000
. chrX 143483028 chrY 4150242 + + 1 0 0.000 0.000 1.000 0.000
. chrY 142402 chrY 262227 + + 0 0 1.000 0.000 0.000 0.000
. chrY 258185 chrY 259206 + + 0 0 1.000 0.000 0.000 0.000
. chrY 259786 chrY 22211175 + + 0 0 1.000 0.000 0.000 0.000
. chrY 671941 chrY 768654 + + 0 0 1.000 0.000 0.000 0.000
. chrY 700407 chrY 737511 + + 0 0 1.000 0.000 0.000 0.000
. chrY 728659 chrY 89651070 + + 0 0 1.000 0.000 0.000 0.000
. chrY 732343 chrY 765350 + + 0 0 1.000 0.000 0.000 0.000
. chrY 764219 chrY 767145 + + 0 0 1.000 0.000 0.000 0.000

tanlongzhi commented 3 months ago

Hi, you have a mismatch in chromosome names in your header (1, 2, 3 …) compared to the actual contacts (chr1, chr2, chr3 …). You must match them exactly.

Best, Tan
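For anyone hitting the same error, here is a minimal sketch of how the mismatch described above could be checked before running hickit. It is not part of hickit; the function name `compare_chrom_names` is made up for illustration, and it assumes whitespace-separated fields with the column order shown in the `#columns:` line (readID chr1 pos1 chr2 pos2 ...).

```python
def compare_chrom_names(lines):
    """Return (header_only, contacts_only) chromosome-name sets for .pairs text lines."""
    header_names = set()
    contact_names = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Only "#chromsize: <name> <length>" header lines carry names.
            if line.startswith("#chromsize:"):
                fields = line.split()
                if len(fields) >= 3:
                    header_names.add(fields[1])
            continue
        fields = line.split()
        if len(fields) >= 5:
            # Assumed column order: readID chr1 pos1 chr2 pos2 ...
            contact_names.add(fields[1])
            contact_names.add(fields[3])
    return header_names - contact_names, contact_names - header_names


# Small excerpt in the same shape as the file posted in this issue.
example = [
    "#chromsize: 1 197195432",
    "#chromsize: X 166650296",
    ". chr1 3003625 chr1 7400077 + + . . 0.836 0.008 0.002 0.154",
    ". chrX 31780402 chrY 4021769 + + 1 0 0.000 0.000 1.000 0.000",
]
header_only, contacts_only = compare_chrom_names(example)
print("in header only:", sorted(header_only))
print("in contacts only:", sorted(contacts_only))
```

If both sets come back empty, the header and the contacts use the same names. For a real file, the lines could be read with `gzip.open(path, "rt")` instead of the inline list.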

tanlongzhi commented 3 months ago

Also, you have only X in your header, but both chrX and chrY in your contacts. What sex is this cell? If female, you must remove all chrY contacts; if male, you must add chrY to your header.
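The two options above could be sketched roughly as follows. This is an illustration only, not hickit code: `reconcile_chr_y` is a hypothetical helper, the chrY length has to be supplied by the user from their own reference assembly (the value in the demo is a placeholder, not a real chromosome length), and placing the new line after the last `#chromsize:` line is an assumption about where it belongs.

```python
def reconcile_chr_y(lines, sex, y_name="chrY", y_size=None):
    """Return a copy of `lines` in which header and contacts agree on chrY."""
    if sex == "female":
        kept = []
        for line in lines:
            fields = line.split()
            is_contact = not line.startswith("#") and len(fields) >= 5
            if is_contact and y_name in (fields[1], fields[3]):
                continue  # drop contacts with either end on chrY
            kept.append(line)
        return kept
    if sex == "male":
        chromsize_idx = [i for i, l in enumerate(lines) if l.startswith("#chromsize:")]
        if any(lines[i].split()[1] == y_name for i in chromsize_idx):
            return list(lines)  # header already declares chrY
        if y_size is None or not chromsize_idx:
            raise ValueError("need y_size and at least one #chromsize line")
        last = chromsize_idx[-1]
        # Insert the new header line right after the last existing #chromsize line.
        return lines[: last + 1] + [f"#chromsize: {y_name} {y_size}"] + lines[last + 1 :]
    raise ValueError("sex must be 'male' or 'female'")


demo = [
    "#chromsize: chrX 100",
    "#columns: readID chr1 pos1 chr2 pos2 strand1 strand2",
    ". chrX 1 chrX 2 + +",
    ". chrX 3 chrY 4 + +",
]
female_fixed = reconcile_chr_y(demo, "female")
male_fixed = reconcile_chr_y(demo, "male", y_size=50)  # 50 is a placeholder length
print(female_fixed)
print(male_fixed)
```

Either way, the names still have to match exactly, so the header entry must be spelled `chrY` if the contacts say `chrY`.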

zcq23 commented 3 months ago

Ohhh yes, I see! I had assumed that the names on the #chromsize lines already corresponded to the chr-prefixed names 😂. I will go ahead and make the changes so that they match exactly. Thank you very much for your response! It was very helpful for me!

zcq23 commented 3 months ago

oh yes my cell is male, I'm going to add the chrY to the header and continue with the next steps! Thank you very much indeed for your reply!!!!!!!!!!!!!!! Sincerely hope you have a wonderful day😊☀☀☀