parklab / MosaicForecast

A mosaic detecting software based on phasing and random forest
MIT License
62 stars, 21 forks

Error: "Wrong number of items passed" #16

Closed: gevro closed this issue 3 years ago

gevro commented 3 years ago

Hi, I'm getting this error from this command. How do I fix this? Note: this seems to be the same as #10, but my command was correct, so that cannot be the explanation. Thanks!

Docker: yanmei/mosaicforecast:0.0.1

python ReadLevel_Features_extraction.py input.bed sample.features bam_dir input/Homo_sapiens_assembly38.fasta input/k24.umap.sorted.bw 2 bam
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'querypos_p'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1071, in set
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'querypos_p'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ReadLevel_Features_extraction.py", line 984, in <module>
    df['querypos_p']=df.apply(lambda row: my_wilcox_pvalue(row['querypos_major'], row['querypos_minor']), axis=1)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 2938, in __setitem__
    self._set_item(key, value)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 3001, in _set_item
    NDFrame._set_item(self, key, value)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 3624, in _set_item
    self._data.set(key, value)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1074, in set
    self.insert(len(self.items), item, value)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1181, in insert
    block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 3047, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 2595, in __init__
    super().__init__(values, ndim=ndim, placement=placement)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 125, in __init__
    f"Wrong number of items passed {len(self.values)}, "
ValueError: Wrong number of items passed 38, placement implies 1

First few lines of tmp file output:

id querypos_major querypos_minor leftpos_major leftpos_minor seqpos_major seqpos_minor mapq_major mapq_minor baseq_major baseq_minor baseq_major_near1b baseq_minor_near1b major_plus major_minus minor_plus minor_minus context1 context2 context1_count context2_count mismatches_major mismatches_minor major_read1 major_read2 minor_read1 minor_read2 dp_near dp_far conflict_num mappability type length GCcontent ref_softclip alt_softclip indel_proportion_SNPonly alt2_proportion_SNPonly
sample~chr1~1107734~A~C 143,142,138,114,110,109,99,87,83,72,65,61,40,38,37,17,7, 149,146,131,126,109,78,18,16,12,11,10, 1107590,1107591,1107595,1107619,1107623,1107624,1107634,1107646,1107650,1107661,1107668,1107672,1107693,1107695,1107696,1107716,1107726, 1107584,1107587,1107602,1107607,1107624,1107655,1107715,1107717,1107721,1107722,1107723, , , 60,60,56,60,60,60,60,60,60,60,60,60,60,60,60,60,60, 60,60,60,60,60,60,60,60,60,60,60, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 30,30,30,30,30,30,30,30,30,30,30, , , 0 0 0 0 AAC GTT 0 0 , , 0 0 0 0 30.714285714285715 36.0 0 0.25 SNP 0 0.6666666666666666 0.0 0.0 0.0 0.0
sample~chr1~1894606~G~C 143,132,130,118,117,116,113,109,109,103,102,102,100,69,68,52,47,20,5,0, 148,138,122,112,109,105,102,34,21,14,7,4, 1894462,1894473,1894475,1894487,1894488,1894489,1894492,1894496,1894496,1894502,1894503,1894503,1894505,1894536,1894537,1894553,1894558,1894585,1894600,1894605, 1894457,1894467,1894483,1894493,1894496,1894500,1894503,1894571,1894584,1894591,1894598,1894601, , , 60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60, 60,60,60,60,60,60,60,60,60,60,60,60, 30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30, 60,30,30,30,30,30,30,30,30,30,30,30, , , 0 0 0 0 AGC GCT 0 0 , , 0 0 0 0 36.142857142857146 39.375 0 1.0 SNP 0 0.7142857142857143 0.0 0.0 0.0 0.0
sample~chr1~1968440~A~C 143,142,134,132,128,117,110,86,82,79,78,72,57,50,45,40,24,19, 148,147,140,91,69, 1968296,1968297,1968305,1968307,1968311,1968322,1968329,1968353,1968357,1968360,1968361,1968367,1968382,1968389,1968394,1968399,1968415,1968420, 1968291,1968292,1968299,1968348,1968370, , , 60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60, 60,60,60,60,60, 30,16,30,30,30,30,30,30,30,30,30,20,30,30,30,30,30,30, 20,20,20,20,20, , , 0 0 0 0 CAT ATG 0 0 , , 0 0 0 0 30.428571428571427 35.125 0 1.0 SNP 0 0.6190476190476191 0.0 0.0 0.0 0.0

I'm trying to figure out where the bug is, and I found that each of the following lines, individually, causes the 'df' dataframe to become empty. That is the source of the error, but I'm not sure why this is happening.

df = df[df.seqpos_minor != ',']
df = df[df.seqpos_major != ',']
df = df[df.baseq_minor_near1b != ',']
df = df[df.baseq_major_near1b != ',']
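
The effect described above can be reproduced on a toy table. This is only a minimal sketch, assuming pandas is available; the two-row frame is made up to mirror the tmp output shown earlier, where seqpos_major and seqpos_minor hold only "," for every site. The helper name `drop_blank_sites` and the guard at the end are hypothetical and not part of MosaicForecast.

```python
import pandas as pd

# Toy stand-in for the tmp feature table (values shortened; not real output).
df = pd.DataFrame(
    {
        "id": ["sample~chr1~1107734~A~C", "sample~chr1~1894606~G~C"],
        "querypos_major": ["143,142,138,", "143,132,130,"],
        "querypos_minor": ["149,146,131,", "148,138,122,"],
        "seqpos_major": [",", ","],  # blank in every row, as in the tmp file
        "seqpos_minor": [",", ","],
    }
)


def drop_blank_sites(table, columns):
    """Keep rows where none of the given columns equals the blank marker ','."""
    for column in columns:
        table = table[table[column] != ","]
    return table


filtered = drop_blank_sites(df, ["seqpos_minor", "seqpos_major"])
print(len(df), "rows before,", len(filtered), "rows after")

# Hypothetical guard: report the problem here instead of letting a later
# df.apply(...) column assignment fail on an empty table.
if filtered.empty:
    print("No sites left after filtering; check the BAM-derived columns.")
```

On an empty frame, `df.apply(..., axis=1)` may hand back a whole frame rather than a single column, depending on the pandas version. That would be consistent with "38 items" in the error message, since the tmp header lists 38 columns.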

douym commented 3 years ago

Hi,

Thanks for your interest in MosaicForecast! Have you checked the format of your input.bed? Do the chromosome names start with "chr", as in hg38? Also, is it possible that "input/k24.umap.sorted.bw" is formatted for hg19?

Best,

Yanmei
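
The naming check suggested here can be done quickly by looking at the first column of each file. Below is a minimal sketch in plain Python, assuming whitespace-separated text lines such as those printed by `head`; the function name `chrom_style` and the sample lines are illustrative only.

```python
def chrom_style(lines):
    """Classify first-column chromosome names as 'chr', 'no-chr', or 'mixed'."""
    names = [line.split()[0] for line in lines if line.strip()]
    if not names:
        raise ValueError("no records to inspect")
    with_prefix = [name.startswith("chr") for name in names]
    if all(with_prefix):
        return "chr"
    if not any(with_prefix):
        return "no-chr"
    return "mixed"


# Illustrative records in BED and bedGraph layout.
bed_lines = ["chr1\t2384860\t2384861\tC\tT\tsample"]
bedgraph_lines = ["chr1\t10157\t10158\t0.000000"]

print(chrom_style(bed_lines), chrom_style(bedgraph_lines))
```

Matching prefixes only rule out a naming mismatch; they do not by themselves show that both files come from the same genome build.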

gevro commented 3 years ago

Hi! Yes input.bed and k24.umap.sorted.bw are both from hg38 with "chr#" notation for chromosomes. So that cannot be the problem.

What else could be the issue?

$ head input.bed
chr1    2384860 2384861 C   T   sample
chr1    5960549 5960550 A   G   sample
chr1    8068981 8068982 A   C   sample
chr1    20021374    20021375    A   G   sample
chr1    34866510    34866511    G   A   sample
chr1    39823543    39823544    T   A   sample
chr1    40907253    40907254    C   T   sample
$ wiggletools write_bg - k24.umap.sorted.bw | head
chr1    10157   10158   0.000000
chr1    10158   10159   0.042000
chr1    10159   10160   0.042000
chr1    10160   10161   0.042000
chr1    10161   10162   0.042000
chr1    10162   10163   0.042000
chr1    10163   10164   0.042000
chr1    10164   10165   0.042000

douym commented 3 years ago

Hi @gevro ,

Could you send me a slice of your bam file so I could test it?

Thanks!

Yanmei

douym commented 3 years ago

Hi @gevro ,

One possible reason is that MosaicForecast currently takes paired-end reads. Is it possible that your reads are single-end? If so, I could modify a version for you to use.

Best,

Yanmei

gevro commented 3 years ago

Hi, Sorry for the late reply. I found that my BAM files did not have the 'NM' tag. Could that be the issue? I am now making BAM files with the 'NM' tag to see if that will work.

douym commented 3 years ago

Hi @gevro ,

If seqpos_minor and seqpos_major return blank results, it is most probably because none of the reads are properly paired: seqpos is only calculated for reads where "pileupread.alignment.is_proper_pair" is true.

NM is also a tag that I use. Sorry for the inconvenience caused. Maybe I should reinvent the wheel instead of assuming all BAM files are in the same format...

Best wishes,

Yanmei
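
The proper-pair condition mentioned above corresponds to bit 0x2 of the SAM FLAG field, and bit 0x1 marks a read as paired. Here is a rough sketch in plain Python for tallying those bits from SAM text (for instance the output of `samtools view`). The function name `count_pairing_flags` and the example records are made up for illustration; this is not MosaicForecast's code.

```python
PAIRED = 0x1       # template has multiple segments (paired-end read)
PROPER_PAIR = 0x2  # each segment properly aligned according to the aligner


def count_pairing_flags(sam_lines):
    """Count records, paired records, and properly paired records."""
    total = paired = proper = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        paired += bool(flag & PAIRED)
        proper += bool(flag & PROPER_PAIR)
    return {"total": total, "paired": paired, "proper_pair": proper}


# Illustrative records: flag 99 is paired and proper, 65 is paired only,
# 0 is an unpaired read.
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",
    "r2\t65\tchr1\t100\t60\t10M\t=\t900\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
print(count_pairing_flags(example))
```

If "proper_pair" is close to zero for reads covering the candidate sites, blank seqpos columns would be expected under the criterion described above.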

gevro commented 3 years ago

I checked and most alignments are properly paired, so that cannot be the issue.

gevro commented 3 years ago

Confirmed that a BAM file with the "NM" tag solves the problem.

douym commented 3 years ago

Great! Thanks for notifying! :)
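
For anyone hitting the same error, one quick way to see whether alignments carry the NM tag is to scan the optional fields of a few SAM records. This is a minimal sketch in plain Python, assuming SAM text input (for instance from `samtools view`); the function names and example records are hypothetical.

```python
def has_tag(sam_line, tag):
    """Return True if a SAM record has the given optional tag (e.g. 'NM')."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields follow.
    return any(field.startswith(tag + ":") for field in fields[11:])


def fraction_with_tag(sam_lines, tag="NM"):
    """Fraction of alignment records (header lines skipped) carrying the tag."""
    records = [l for l in sam_lines if l.strip() and not l.startswith("@")]
    if not records:
        return 0.0
    return sum(has_tag(l, tag) for l in records) / len(records)


# Illustrative records: the first has NM, the second does not.
example = [
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII\tNM:i:0",
    "r2\t147\tchr1\t200\t60\t10M\t=\t100\t-110\tACGTACGTAC\tIIIIIIIIII\tAS:i:10",
]
print(fraction_with_tag(example))
```

A result near zero suggests the tag is missing. Tools such as `samtools calmd` can compute NM (and MD) tags for an existing BAM; which route was taken in this thread is not stated.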