seqscope / ficture

error when running example dataset #6

Closed: lidanwu closed this issue 4 months ago

lidanwu commented 4 months ago

Hello there, I am following the instructions here to install ficture and run the example dataset, but I ran into an error when it tried to load a file it had generated in an earlier step.

ficture factor_report --path output1/analysis/nF12.d_18 --pref nF12.d_18.decode.prj_18.r_4_5 --color_table output1/analysis/nF12.d_18/figure/nF12.d_18.rgb.tsv
['0' '1' '2' '3' '4' '5' '6' '7' '8' '9' '10' '11']
ficture plot_pixel_full --input output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.pixel.sorted.tsv.gz --color_table output1/analysis/nF12.d_18/figure/nF12.d_18.rgb.tsv --output output1/analysis/nF12.d_18/figure/nF12.d_18.decode.prj_18.r_4_5.pixel.png --plot_um_per_pixel 0.5 --full
01:04:41 AM Background color 000000
01:04:41 AM Read color table (12)
Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11'], dtype='object', name='Name')
01:04:42 AM Read header {'K': 12, 'TOPK': 3, 'BLOCK_SIZE': 2000, 'BLOCK_AXIS': 'X', 'INDEX_AXIS': 'Y', 'OFFSET_X': 6690, 'OFFSET_Y': 6772, 'SIZE_X': '', 'SIZE_Y': '', 'SCALE': 100}
Traceback (most recent call last):
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/bin/ficture", line 8, in <module>
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/", line 43, in main
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/scripts/", line 65, in plot_pixel_full
    loader = BlockIndexedLoader(args.input, args.xmin, args.xmax, args.ymin, args.ymax, args.full, not args.org_coord, idtype=dty)
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/loaders/", line 47, in __init__
    self.xmax = min(xmax, self.meta["SIZE_X"])
TypeError: '<' not supported between instances of 'str' and 'float'
make: *** [output1/Makefile:52: output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.done] Error 1

Below is the full log. Could someone help with this? Any advice would be appreciated!

(ficture) rstudio@01ffd5ba8129:~/NAS_data/lwu/github_drops/ficture$ ficture run_together --in-tsv examples/data/transcripts.tsv.gz     --in-minmax examples/data/coordinate_minmax.tsv     --in-feature examples/data/feature.clean.tsv.gz     --out-dir output1 --all
Creating minibatch from examples/data/transcripts.tsv.gz...
ficture make_spatial_minibatch --input examples/data/transcripts.tsv.gz --output output1/batched.matrix.tsv --mu_scale 1.0 --batch_size 500 --batch_buff 30 --major_axis Y
INFO:root:Random seed 1714611393.957012
['X', 'random_index', 'Y', 'gene', 'MoleculeID', 'Count']
INFO:root:Read blocks of pixels: 674.99 x 267.61
INFO:root:Read blocks of pixels: 674.99 x 539.15
INFO:root:Output region (675.99, 540.15) (6689.0, 7365.0) x (6771.0, 7311.1)
INFO:root:Left over size 74443 (74443, 29.99)
INFO:root:Read blocks of pixels: 674.99 x 165.84
INFO:root:Read blocks of pixels: 674.99 x 165.84
INFO:root:Output region (675.99, 166.84) (6689.0, 7365.0) x (7280.2, 7447.0)
sort -k 2,2n -k 1,1g output1/batched.matrix.tsv | gzip -c > output1/batched.matrix.tsv.gz
rm output1/batched.matrix.tsv
Creating DGE for 18um...
ficture make_dge --key Count --count_header Count --input examples/data/transcripts.tsv.gz --output output1/hexagon.d_18.tsv --hex_width 18 --n_move 2 --min_ct_per_unit 50 --mu_scale 1.0 --precision 2 --major_axis Y
INFO:root:Random seed 1714611480.9238439
['X', 'Y', 'gene', 'MoleculeID', 'Count']
INFO:root:Processing 999568 pixels (1000000 6772.0, 7039.61).
INFO:root:Sliding offset 0, 0. Add 582 units, median count 1553.0, 582 units so far.
INFO:root:Sliding offset 0, 1. Add 583 units, median count 1537.0, 1165 units so far.
INFO:root:Sliding offset 1, 0. Add 588 units, median count 1543.5, 1753 units so far.
INFO:root:Sliding offset 1, 1. Add 579 units, median count 1534.0, 2332 units so far.
INFO:root:Left over size 148883 (7004, 7040).
INFO:root:Processing 1148450 pixels (1148883 7003.62, 7311.15).
INFO:root:Sliding offset 0, 0. Add 652 units, median count 1540.5, 2984 units so far.
INFO:root:Sliding offset 0, 1. Add 627 units, median count 1594.0, 3611 units so far.
INFO:root:Sliding offset 1, 0. Add 656 units, median count 1551.0, 4267 units so far.
INFO:root:Sliding offset 1, 1. Add 632 units, median count 1558.0, 4899 units so far.
INFO:root:Left over size 89418 (7275, 7311).
INFO:root:Processing 551580 pixels (551789 7275.16, 7447.0).
INFO:root:Sliding offset 0, 0. Add 319 units, median count 1303.5, 5218 units so far.
INFO:root:Sliding offset 0, 1. Add 347 units, median count 1316.0, 5565 units so far.
INFO:root:Sliding offset 1, 0. Add 316 units, median count 1329.5, 5881 units so far.
INFO:root:Sliding offset 1, 1. Add 346 units, median count 1289.0, 6227 units so far.
INFO:root:Left over size 133240 (7411, 7447).
sort -k 1,1n output1/hexagon.d_18.tsv | gzip -c > output1/hexagon.d_18.tsv.gz
rm output1/hexagon.d_18.tsv
Creating LDA for 18um and 12 factors...
mkdir -p output1/analysis/nF12.d_18/figure
ficture fit_model --input output1/hexagon.d_18.tsv.gz --output output1/analysis/nF12.d_18/nF12.d_18 --feature examples/data/feature.clean.tsv.gz --nFactor 12 --epoch 3 --epoch_id_length 2 --unit_attr X Y --key Count --min_ct_per_feature 20 --test_split 0.5 --R 10 --thread 1
INFO:root:Read data with 1552 units, 347 features
INFO:root:0: -6903.32, -6831.96
INFO:root:Counter({10: 227, 2: 189, 11: 117, 8: 73, 4: 53, 6: 42, 3: 33, 9: 29, 7: 12, 1: 1})
INFO:root:R=0, 31.47, 32.04, 3.00s
INFO:root:1: -6900.46, -6829.79
INFO:root:Counter({10: 229, 2: 190, 11: 116, 8: 70, 6: 49, 4: 41, 3: 37, 9: 22, 7: 19, 1: 2, 0: 1})
INFO:root:R=1, 31.29, 32.14, 2.98s
INFO:root:2: -6903.65, -6832.32
INFO:root:Counter({10: 233, 2: 178, 11: 130, 8: 85, 3: 46, 4: 40, 9: 27, 6: 21, 7: 14, 1: 1, 0: 1})
INFO:root:R=2, 30.76, 32.56, 3.00s
INFO:root:3: -6902.74, -6831.99
INFO:root:Counter({10: 229, 2: 185, 11: 117, 8: 73, 4: 51, 6: 46, 3: 34, 9: 32, 7: 7, 0: 1, 1: 1})
INFO:root:R=3, 31.18, 32.59, 2.99s
INFO:root:4: -6904.73, -6833.63
INFO:root:Counter({10: 216, 2: 195, 11: 110, 8: 94, 6: 43, 3: 42, 4: 38, 9: 27, 7: 9, 1: 1, 0: 1})
INFO:root:R=4, 30.78, 31.49, 3.01s
INFO:root:5: -6902.37, -6831.62
INFO:root:Counter({10: 227, 2: 192, 11: 116, 8: 88, 6: 38, 4: 36, 3: 34, 9: 32, 7: 10, 1: 3})
INFO:root:R=5, 30.64, 31.72, 2.99s
INFO:root:6: -6903.47, -6832.38
INFO:root:Counter({10: 230, 2: 187, 11: 119, 8: 89, 4: 38, 6: 37, 3: 36, 9: 30, 7: 8, 1: 2})
INFO:root:R=6, 30.40, 31.53, 3.00s
INFO:root:7: -6903.93, -6832.64
INFO:root:Counter({10: 230, 2: 188, 11: 133, 8: 72, 4: 43, 3: 37, 9: 29, 6: 27, 7: 15, 1: 2})
INFO:root:R=7, 30.08, 30.98, 2.99s
INFO:root:8: -6901.82, -6830.87
INFO:root:Counter({10: 228, 2: 189, 11: 116, 8: 74, 6: 54, 4: 49, 9: 29, 3: 27, 7: 8, 0: 1, 1: 1})
INFO:root:R=8, 30.55, 32.29, 3.05s
INFO:root:9: -6902.67, -6831.69
INFO:root:Counter({10: 224, 2: 201, 11: 115, 8: 84, 6: 39, 3: 38, 4: 34, 9: 28, 7: 9, 1: 4})
INFO:root:R=9, 31.06, 32.19, 3.01s
0    31.467043
1    31.286540
3    31.177941
9    31.057617
4    30.781998
2    30.763676
5    30.636084
8    30.551041
6    30.399987
7    30.084390
Name: Score, dtype: float64
/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/loaders/ FutureWarning: The provided callable <built-in function sum> is currently using SeriesGroupBy.sum. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "sum" instead.
  self.brc = self.brc.merge(right = self.df.groupby(by='unit').agg({self.train_key:sum}).reset_index(), on = 'unit', how = 'inner' )
INFO:root:Result file output1/analysis/nF12.d_18/nF12.d_18.fit_result.tsv.gz
ficture choose_color --input output1/analysis/nF12.d_18/nF12.d_18.fit_result.tsv.gz --output output1/analysis/nF12.d_18/figure/nF12.d_18 --cmap_name turbo
ficture plot_base --input output1/analysis/nF12.d_18/nF12.d_18.fit_result.tsv.gz --output output1/analysis/nF12.d_18/figure/nF12.d_18.coarse --fill_range 10.0 --color_table output1/analysis/nF12.d_18/figure/nF12.d_18.rgb.tsv --plot_um_per_pixel 1 --plot_discretized
/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/scripts/ FutureWarning: The provided callable <function mean at 0x7fca3a282ac0> is currently using SeriesGroupBy.mean. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "mean" instead.
  chunk = chunk.groupby(by = ['x_indx', 'y_indx']).agg({ x:np.mean for x in factor_header }).reset_index()
/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/scripts/ FutureWarning: The provided callable <function mean at 0x7fca3a282ac0> is currently using SeriesGroupBy.mean. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "mean" instead.
  df = df.groupby(by = ['x_indx', 'y_indx']).agg({ x:np.mean for x in factor_header }).reset_index()
INFO:root:Read region 1552 pixels in region 685.0 x 640.0
640 500
INFO:root:Filling pixels 0 - 500 / 639
INFO:root:Filling pixels 500 - 639 / 639
INFO:root:Start constructing RGB image
INFO:root:Made fractional image
INFO:root:Made hard threshold image
touch output1/analysis/nF12.d_18/nF12.d_18.done
Creating projection for 18um and 12 factors, at 18um
Performing pixel-level decoding..
Sorting and reformatting the pixel-level output..
Performing pseudo-bulk differential expression analysis..
Drawing pixel-level output image...
ficture transform --input examples/data/transcripts.tsv.gz --output_pref output1/analysis/nF12.d_18/nF12.d_18.prj_18.r_4 --model output1/analysis/nF12.d_18/nF12.d_18.model.p --key Count --major_axis Y --hex_width 18 --n_move 4 --min_ct_per_unit 20 --mu_scale 1.0 --thread 1 --precision 2
INFO:root:Model loaded with 347 features and 12 factors
INFO:root:Transformed 1 batches with total 9453 units, 0.412512min
INFO:root:Transformed 2 batches with total 19909 units, 0.880552min
INFO:root:Transformed 3 batches with total 25256 units, 1.121247min
ficture slda_decode --input output1/batched.matrix.tsv.gz --output output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5 --model output1/analysis/nF12.d_18/nF12.d_18.model.p --anchor output1/analysis/nF12.d_18/nF12.d_18.prj_18.r_4.fit_result.tsv.gz --anchor_in_um --neighbor_radius 5 --mu_scale 1.0 --key Count --precision 0.1 --lite_topk_output_pixel 3 --lite_topk_output_anchor 3 --thread 1
INFO:root:347 genes and 12 factors are read from input model
INFO:root:Read 25106 grid points
INFO:root:Read 499227 pixels, forming 1 batches.
INFO:root:Read 1 batches ((499227, 347))
INFO:root:Output 462033 pixels and 5691 anchors
INFO:root:Read 1893871 pixels, forming 1 batches.
INFO:root:Read 1 batches ((1893871, 347))
INFO:root:Output 1893871 pixels and 20096 anchors
bash output1/ output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.pixel.tsv.gz output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.pixel.sorted.tsv.gz examples/data/coordinate_minmax.tsv nF12.d_18 100 100 3 bgzip tabix
6690, 7365; 6772, 7447
output1/ line 20: bc: command not found
output1/ line 21: bc: command not found
ficture de_bulk --input output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.posterior.count.tsv.gz --output output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.bulk_chisq.tsv --min_ct_per_feature 20 --max_pval_output 0.001 --min_fold_output 1.5 --thread 1
Read posterior count over 347 genes and 12 factors
Testing 347 genes over 12 factors
ficture factor_report --path output1/analysis/nF12.d_18 --pref nF12.d_18.decode.prj_18.r_4_5 --color_table output1/analysis/nF12.d_18/figure/nF12.d_18.rgb.tsv
['0' '1' '2' '3' '4' '5' '6' '7' '8' '9' '10' '11']
ficture plot_pixel_full --input output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.pixel.sorted.tsv.gz --color_table output1/analysis/nF12.d_18/figure/nF12.d_18.rgb.tsv --output output1/analysis/nF12.d_18/figure/nF12.d_18.decode.prj_18.r_4_5.pixel.png --plot_um_per_pixel 0.5 --full
01:04:41 AM Background color 000000
01:04:41 AM Read color table (12)
Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11'], dtype='object', name='Name')
01:04:42 AM Read header {'K': 12, 'TOPK': 3, 'BLOCK_SIZE': 2000, 'BLOCK_AXIS': 'X', 'INDEX_AXIS': 'Y', 'OFFSET_X': 6690, 'OFFSET_Y': 6772, 'SIZE_X': '', 'SIZE_Y': '', 'SCALE': 100}
Traceback (most recent call last):
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/bin/ficture", line 8, in <module>
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/", line 43, in main
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/scripts/", line 65, in plot_pixel_full
    loader = BlockIndexedLoader(args.input, args.xmin, args.xmax, args.ymin, args.ymax, args.full, not args.org_coord, idtype=dty)
  File "/home/rstudio/.rminiconda/ficture_python/envs/ficture/lib/python3.11/site-packages/ficture/loaders/", line 47, in __init__
    self.xmax = min(xmax, self.meta["SIZE_X"])
TypeError: '<' not supported between instances of 'str' and 'float'
make: *** [output1/Makefile:52: output1/analysis/nF12.d_18/nF12.d_18.decode.prj_18.r_4_5.done] Error 1
(ficture) rstudio@01ffd5ba8129:~/NAS_data/lwu/github_drops/ficture$ 

Yichen-Si commented 4 months ago

This happened because your environment does not have the Linux command "bc", so the output "nF12.d_18.decode.prj_18.r_4_5.pixel.sorted.tsv.gz" is missing some metadata. Could you install bc? Alternatively, you can check "" and modify it to do the math differently. (If you look at the first few lines of "nF12.d_18.decode.prj_18.r_4_5.pixel.sorted.tsv.gz", you will see that "SIZE_X" and "SIZE_Y" are missing.)
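
The failure can be reproduced outside ficture, which may help confirm the diagnosis. The following is a minimal sketch that uses the header dictionary exactly as printed in the log above. `find_empty_meta` and `reproduce_error` are hypothetical helpers, not part of ficture, and using `float("inf")` for `xmax` is an assumption about what the loader holds when `--full` is passed.

```python
# Header values as printed by plot_pixel_full in the log above.
meta = {'K': 12, 'TOPK': 3, 'BLOCK_SIZE': 2000, 'BLOCK_AXIS': 'X',
        'INDEX_AXIS': 'Y', 'OFFSET_X': 6690, 'OFFSET_Y': 6772,
        'SIZE_X': '', 'SIZE_Y': '', 'SCALE': 100}


def find_empty_meta(meta, keys=("SIZE_X", "SIZE_Y")):
    """Hypothetical helper: return the keys whose value is absent or empty."""
    return [k for k in keys if meta.get(k, "") == ""]


def reproduce_error(meta):
    """Mimic the failing comparison and return the error message, if any.

    xmax = inf is an assumption; any float triggers the same comparison.
    """
    xmax = float("inf")
    try:
        min(xmax, meta["SIZE_X"])
    except TypeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print("Empty header fields:", find_empty_meta(meta))
    print("Error from min():", reproduce_error(meta))
```

Because `SIZE_X` is an empty string rather than a number, `min()` ends up comparing a `str` with a `float`, which is the `TypeError` shown in the traceback.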

lidanwu commented 4 months ago

Thank you for the response! I installed bc and the example dataset now runs without issue. As a side note, it might be good to add bc to the installation instructions. I am using an AWS EC2 instance, which doesn't come with any of the Linux utilities pre-installed. It was helpful that this webpage tells me to install htslib for bgzip and tabix, but it doesn't mention bc.
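
For anyone else on a minimal image, a quick preflight check before running the example might look like the sketch below. The tool list is an assumption pieced together from this thread and the log (bc, bgzip, tabix, sort, gzip), not an official dependency list, and `missing_tools` is a hypothetical helper rather than anything shipped with ficture.

```python
import shutil

# Assumed list: external commands that appear in this thread and its log.
REQUIRED_TOOLS = ("bc", "bgzip", "tabix", "sort", "gzip")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the commands from `tools` that cannot be found on PATH."""
    return [t for t in tools if which(t) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing commands:", ", ".join(missing))
    else:
        print("All listed commands were found on PATH.")
```

Running this before `ficture run_together` would have flagged bc up front instead of failing at the plotting step.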