Open krabapple opened 3 years ago
I am trying to understand how STAR (v2.7.6a) calculates genome length. My genome consists of 219 contigs. I compute their total length to be 181467262 bp. STAR says this (from the Log.out file):
`genomeFileSizes 234094592 1496773479 ~RE-DEFINED Genome version is compatible with current STAR Number of real (reference) chromosomes= 219 1 Contig0 40212017 0 2 Contig1 34657275 40370176 3 Contig2 26081721 75235328 4 Contig3 27738065 101449728 5 Contig4 20331543 129236992 6 Contig5 27726387 149684224 7 Contig6 4123 177471488 8 Contig7 6891 177733632 9 Contig8 4968 177995776 10 Contig9 9349 178257920 11 Contig10 8113 178520064 12 Contig11 22343 178782208 13 Contig12 3023 179044352 14 Contig13 57747 179306496 15 Contig14 5207 179568640 16 Contig15 4456 179830784 17 Contig16 2809 180092928 18 Contig17 18317 180355072 19 Contig18 9604 180617216 20 Contig19 1029 180879360 21 Contig20 946 181141504 22 Contig21 736 181403648 23 Contig22 13420 181665792 24 Contig23 5175 181927936 25 Contig24 3763 182190080 26 Contig25 3688 182452224 27 Contig26 9035 182714368 28 Contig27 14061 182976512 29 Contig28 3465 183238656 30 Contig29 2539 183500800 31 Contig30 9997 183762944 32 Contig31 3707 184025088 33 Contig32 5692 184287232 34 Contig33 7833 184549376 35 Contig34 19122 184811520 36 Contig35 2519 185073664 37 Contig36 97402 185335808 38 Contig37 8463 185597952 39 Contig38 2525 185860096 40 Contig39 10299 186122240 41 Contig40 22957 186384384 42 Contig41 24494 186646528 43 Contig42 10465 186908672 44 Contig43 7981 187170816 45 Contig44 109354 187432960 46 Contig45 2743 187695104 47 Contig46 1134 187957248 48 Contig47 4178 188219392 49 Contig48 3189 188481536 50 Contig49 7148 188743680 51 Contig50 25490 189005824 52 Contig51 55442 189267968 53 Contig52 5726 189530112 54 Contig53 6598 189792256 55 Contig54 509 190054400 56 Contig55 481 190316544 57 Contig56 28446 190578688 58 Contig57 1310 190840832 59 Contig58 5254 191102976 60 Contig59 3222 191365120 61 Contig60 13047 191627264 62 Contig61 4248 191889408 63 Contig62 5806 192151552 64 Contig63 21103 192413696 65 Contig64 8845 192675840 66 Contig65 2843 192937984 67 Contig66 16958 193200128 68 Contig67 2798 193462272 69 Contig68 
4790 193724416 70 Contig69 8856 193986560 71 Contig70 384 194248704 72 Contig71 4094 194510848 73 Contig72 1913 194772992 74 Contig73 18 195035136 75 Contig74 3074 195297280 76 Contig75 1540 195559424 77 Contig76 14406 195821568 78 Contig77 6335 196083712 79 Contig78 6278 196345856 80 Contig79 71697 196608000 81 Contig80 45024 196870144 82 Contig81 2508 197132288 83 Contig82 18516 197394432 84 Contig83 2887 197656576 85 Contig84 1903 197918720 86 Contig85 11708 198180864 87 Contig86 3578 198443008 88 Contig87 357916 198705152 89 Contig88 114971 199229440 90 Contig89 5793 199491584 91 Contig90 1286 199753728 92 Contig91 3055 200015872 93 Contig92 16296 200278016 94 Contig93 6014 200540160 95 Contig94 12162 200802304 96 Contig95 8480 201064448 97 Contig96 5112 201326592 98 Contig97 3444 201588736 99 Contig98 13493 201850880 100 Contig99 45819 202113024 101 Contig100 523 202375168 102 Contig101 28565 202637312 103 Contig102 3468 202899456 104 Contig103 12969 203161600 105 Contig104 12041 203423744 106 Contig105 9396 203685888 107 Contig106 3916 203948032 108 Contig107 5858 204210176 109 Contig108 12534 204472320 110 Contig109 13158 204734464 111 Contig110 10909 204996608 112 Contig111 11691 205258752 113 Contig112 22393 205520896 114 Contig113 3980 205783040 115 Contig114 312855 206045184 116 Contig115 7194 206569472 117 Contig116 2590 206831616 118 Contig117 6291 207093760 119 Contig118 7112 207355904 120 Contig119 6709 207618048 121 Contig120 16764 207880192 122 Contig121 9402 208142336 123 Contig122 5531 208404480 124 Contig123 2335 208666624 125 Contig124 4441 208928768 126 Contig125 6496 209190912 127 Contig126 1036 209453056 128 Contig127 72179 209715200 129 Contig128 223072 209977344 130 Contig129 3938 210239488 131 Contig130 4059 210501632 132 Contig131 53082 210763776 133 Contig132 4341 211025920 134 Contig133 3806 211288064 135 Contig134 8981 211550208 136 Contig135 1031 211812352 137 Contig136 1186 212074496 138 Contig137 3169 212336640 139 Contig138 4897 
212598784 140 Contig139 2716 212860928 141 Contig140 30666 213123072 142 Contig141 17781 213385216 143 Contig142 27440 213647360 144 Contig143 33583 213909504 145 Contig144 11703 214171648 146 Contig145 3872 214433792 147 Contig146 39921 214695936 148 Contig147 869 214958080 149 Contig148 23673 215220224 150 Contig149 4065 215482368 151 Contig150 20782 215744512 152 Contig151 4631 216006656 153 Contig152 4316 216268800 154 Contig153 17715 216530944 155 Contig154 23346 216793088 156 Contig155 6222 217055232 157 Contig156 233009 217317376 158 Contig157 4947 217579520 159 Contig158 7561 217841664 160 Contig159 4309 218103808 161 Contig160 11984 218365952 162 Contig161 1782 218628096 163 Contig162 78174 218890240 164 Contig163 3784 219152384 165 Contig164 156932 219414528 166 Contig165 3680 219676672 167 Contig166 4336 219938816 168 Contig167 1977 220200960 169 Contig168 7179 220463104 170 Contig169 4993 220725248 171 Contig170 170613 220987392 172 Contig171 37773 221249536 173 Contig172 305720 221511680 174 Contig173 3137 222035968 175 Contig174 13327 222298112 176 Contig175 5931 222560256 177 Contig176 21589 222822400 178 Contig177 5325 223084544 179 Contig178 6159 223346688 180 Contig179 17036 223608832 181 Contig180 16855 223870976 182 Contig181 4724 224133120 183 Contig182 7914 224395264 184 Contig183 1157 224657408 185 Contig184 3871 224919552 186 Contig185 22644 225181696 187 Contig186 236109 225443840 188 Contig187 18573 225705984 189 Contig188 5590 225968128 190 Contig189 2928 226230272 191 Contig190 3490 226492416 192 Contig191 57839 226754560 193 Contig192 1344 227016704 194 Contig193 25979 227278848 195 Contig194 2287 227540992 196 Contig195 14255 227803136 197 Contig196 1207 228065280 198 Contig197 8071 228327424 199 Contig198 25840 228589568 200 Contig199 45732 228851712 201 Contig200 18488 229113856 202 Contig201 18515 229376000 203 Contig202 9407 229638144 204 Contig203 25789 229900288 205 Contig204 6552 230162432 206 Contig205 5585 230424576 207 
Contig206 26641 230686720 208 Contig207 25591 230948864 209 Contig208 6344 231211008 210 Contig209 77327 231473152 211 Contig210 34205 231735296 212 Contig211 14750 231997440 213 Contig212 5360 232259584 214 Contig213 19870 232521728 215 Contig214 2008 232783872 216 Contig215 7487 233046016 217 Contig216 6226 233308160 218 Contig217 6237 233570304 219 Contig218 1522 233832448 --sjdbOverhang = 99 taken from the generated genome Started loading the genome: Sat Apr 3 18:54:35 2021 Genome: size given as a parameter = 234094592
`
The sum I get for column 3 matches my previous calculation (181467262).
How is STAR coming up with 234094592?
And how are the values in column 4 calculated?
Never mind. I see I'm confusing file size (bytes) with genome size.
Hi @krabapple
STAR pads the spaces between chromosomes, so the total genome "size" in RAM is not the sum of the chromosome lengths.
Cheers Alex
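For anyone landing here with the same question about column 4: the start offsets in the log above are all multiples of 262144 (2^18), which matches STAR's default `--genomeChrBinNbits 18`. Below is a minimal sketch of that padding, inferred from the logged values rather than taken from the STAR source. The function names are hypothetical, and the rule that a contig is always followed by at least one padding base is an assumption; nothing in this log distinguishes it from plain rounding up.

```python
# Sketch of how the start offsets in the log can be reproduced.
# Assumption: each contig starts on a bin boundary, bin size = 2**genomeChrBinNbits.
BIN_BITS = 18          # STAR default for --genomeChrBinNbits
BIN = 1 << BIN_BITS    # 262144 bases per bin


def next_start(start, length, bin_size=BIN):
    """Offset of the next contig: first bin boundary strictly after this contig's end.

    Assumption (not confirmed from the STAR source): a contig whose end falls
    exactly on a boundary still gets a full extra bin of padding.
    """
    return ((start + length) // bin_size + 1) * bin_size


def padded_starts(lengths, bin_size=BIN):
    """Return (list of start offsets, padded total length) for the given contig lengths."""
    starts = []
    pos = 0
    for length in lengths:
        starts.append(pos)
        pos = next_start(pos, length, bin_size)
    return starts, pos


# Lengths of Contig0..Contig6, copied from the log above.
lengths = [40212017, 34657275, 26081721, 27738065, 20331543, 27726387, 4123]
starts, total = padded_starts(lengths)
print(starts)  # start offsets of Contig0..Contig6
print(total)   # start offset of Contig7

# Last contig in the log: Contig218, length 1522, start 233832448.
print(next_start(233832448, 1522))
```

The printed starts agree with column 4 of the log for Contig0 through Contig7, and the last line gives 234094592, the value STAR reports as the genome size. The difference from 181467262 is the padding: 219 contigs each rounded up to a whole number of 262144-base bins. With many small contigs, the STAR manual suggests lowering `--genomeChrBinNbits` to reduce this overhead.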