caporaso-lab / sourcetracker2

SourceTracker2
BSD 3-Clause "New" or "Revised" License

leave-one-out (LOO) strategy TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind' #148

Open CXang opened 2 years ago

CXang commented 2 years ago

My data (both source and sink) consist of very small values, on the order of 1×10^-7. With the original data I can estimate the proportions of different sources in a sink sample. However, I cannot use the original data to verify the accuracy of the model with the leave-one-out (LOO) strategy. I suspect the problem lies in the data, because the calculation succeeds once I convert the values to integers (multiplying by 10^7 and taking the ceiling; this is the enlarged data below). Is it valid to verify the model with this enlarged dataset?

There are two main ways to use this script: (1) estimating the proportions of different (microbial) sources to a sample of a (microbial) sink; (2) using a leave-one-out (LOO) strategy to predict the metadata class of a given (microbial) sample.

I use the test dataset as an example:

original data:

OTU ID  s4         s5         s7         s8         s9
o0      0.0000004  0.0000005  0.0000007  0.0000008  0.0000009
o1      0.0000014  0.0000015  0.0000017  0.0000018  0.0000019
o2      0.0000024  0.0000025  0.0000027  0.0000028  0.0000029
o3      0.0000034  0.0000035  0.0000037  0.0000038  0.0000039
o4      0.0000044  0.0000045  0.0000047  0.0000048  0.0000049
o5      0.0000054  0.0000055  0.0000057  0.0000058  0.0000059
o6      0.0000064  0.0000065  0.0000067  0.0000068  0.0000069
o7      0.0000074  0.0000075  0.0000077  0.0000078  0.0000079
o8      0.0000084  0.0000085  0.0000087  0.0000088  0.0000089
o9      0.0000094  0.0000095  0.0000097  0.0000098  0.0000099
o10     0.0000104  0.0000105  0.0000107  0.0000108  0.0000109
o11     0.0000114  0.0000115  0.0000117  0.0000118  0.0000119
o12     0.0000124  0.0000125  0.0000127  0.0000128  0.0000129
o13     0.0000134  0.0000135  0.0000137  0.0000138  0.0000139
o14     0.0000144  0.0000145  0.0000147  0.0000148  0.0000149
o15     0.0000154  0.0000155  0.0000157  0.0000158  0.0000159
o16     0.0000164  0.0000165  0.0000167  0.0000168  0.0000169
o17     0.0000174  0.0000175  0.0000177  0.0000178  0.0000179
o18     0.0000184  0.0000185  0.0000187  0.0000188  0.0000189
o19     0.0000194  0.0000195  0.0000197  0.0000198  0.0000199

enlarged data:

OTU ID  s4   s5   s7   s8   s9
o0      4    5    7    8    9
o1      14   15   17   18   19
o2      24   25   27   28   29
o3      34   35   37   38   39
o4      44   45   47   48   49
o5      54   55   57   58   59
o6      64   65   67   68   69
o7      74   75   77   78   79
o8      84   85   87   88   89
o9      94   95   97   98   99
o10     104  105  107  108  109
o11     114  115  117  118  119
o12     124  125  127  128  129
o13     134  135  137  138  139
o14     144  145  147  148  149
o15     154  155  157  158  159
o16     164  165  167  168  169
o17     174  175  177  178  179
o18     184  185  187  188  189
o19     194  195  197  198  199
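The enlargement described above can be sketched with NumPy. This is only an illustration on the first three rows of the test table, not SourceTracker2's own preprocessing; the names `raw`, `SCALE`, and `scaled` are made up for the example. It uses `np.rint` rather than a ceiling, on the assumption that rounding to the nearest integer is what is intended, since a ceiling can bump a value up by one when floating-point multiplication lands slightly above a whole number.

```python
import numpy as np

# First three OTUs (o0, o1, o2) of the original test table; columns are s4, s5, s7, s8, s9.
raw = np.array([
    [0.0000004, 0.0000005, 0.0000007, 0.0000008, 0.0000009],
    [0.0000014, 0.0000015, 0.0000017, 0.0000018, 0.0000019],
    [0.0000024, 0.0000025, 0.0000027, 0.0000028, 0.0000029],
])

SCALE = 10**7  # hypothetical scale factor, chosen to match the 10^7 enlargement in the post

# Round to the nearest integer before casting so tiny floating-point
# errors in the multiplication cannot shift a value by one.
scaled = np.rint(raw * SCALE).astype(np.int64)

# Per-sample relative abundances before and after scaling.
raw_prop = raw / raw.sum(axis=0)
scaled_prop = scaled / scaled.sum(axis=0)

print(scaled)
print(np.allclose(raw_prop, scaled_prop))
```

The check only shows that a uniform scale factor leaves each sample's relative abundances unchanged. Whether values scaled this way are appropriate as counts for the LOO run is a separate question that this sketch does not answer.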

error:

Traceback (most recent call last):
  File "/home/chenp/miniconda2/envs/st2/bin/sourcetracker2", line 8, in <module>
    sys.exit(cli())
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/sourcetracker/_cli/gibbs.py", line 214, in gibbs
    f(sample)
  File "/home/chenp/miniconda2/envs/st2/lib/python3.5/site-packages/sourcetracker/_sourcetracker.py", line 861, in _cli_loo_runner
    _sd[row] -= sink_data
TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
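The last frame of the traceback is an in-place subtraction, and the message says a float64 result could not be written back into an int64 array. A minimal sketch that reproduces the same kind of failure with plain NumPy follows. The arrays and the helper `try_inplace_subtract` are hypothetical stand-ins, not SourceTracker2 code, and the sketch assumes (from the dtypes in the message) that the table being updated is int64 while the sink values are float64.

```python
import numpy as np

# Hypothetical stand-ins for the arrays in the traceback:
# an integer table and a float sample subtracted from one of its rows.
source_counts = np.array([[4, 5, 7], [14, 15, 17]], dtype=np.int64)
float_sample = np.array([4e-7, 5e-7, 7e-7], dtype=np.float64)
int_sample = np.array([4, 5, 7], dtype=np.int64)


def try_inplace_subtract(table, row, values):
    """Return None on success, or the error message if NumPy refuses the cast."""
    try:
        # In-place subtraction must store the result in the existing
        # integer row, so a float result cannot be cast back safely.
        table[row] -= values
    except TypeError as exc:
        return str(exc)
    return None


float_error = try_inplace_subtract(source_counts.copy(), 0, float_sample)
int_error = try_inplace_subtract(source_counts.copy(), 0, int_sample)

print(float_error)  # a "Cannot cast ufunc ... 'same_kind'" message
print(int_error)    # None: integer minus integer stays integer
```

This is consistent with the observation in the post that the run succeeds once the table holds integers, although it does not show where in SourceTracker2 the integer array is created.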