Open dkoslicki opened 2 years ago
@dkoslicki, I'm sorry this issue caused confusion. K17348 is missing from the data dump because none of the organisms associated with this KO belong to 'Archaea', 'Bacteria', or 'Fungi'. When I downloaded the gene data, I only considered these three categories. If you look at the gene section on the website again, you will find that all of the genes come from animals such as vertebrates, mollusks, or flatworms (please also see the list below).
array([['hsa', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ptr', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pps', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ggo', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pon', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['nle', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mcc', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mcf', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['csab', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['caty', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['panu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['rro', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['rbb', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['tfn', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pteh', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cjc', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['sbq', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['csyr', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mmur', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oga', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mmu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mcal', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mpah', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['rno', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mcoc', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mun', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cge', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pleu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ngi', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['hgl', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cpoc', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ccan', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['dord', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['dsp', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ocu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['opi', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['tup', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cfa', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['vvp', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['vlg', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['aml', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['umr', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['uah', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['uar', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oro', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['elk', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mpuf', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['eju', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['zca', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mlx', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['fca', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pyu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pbg', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ptg', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ppad', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['aju', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['hhv', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['bta', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['bom', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['biu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['bbub', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['chx', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oas', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oda', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ccad', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ssc', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cfr', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cbai', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['cdk', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['bacu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['lve', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oor', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['dle', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pcad', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['psiu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ecb', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['epz', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['eai', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['myb', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['myd', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mmyo', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mlf', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mna', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pkl', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['hai', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['dro', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['shon', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ajm', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pdic', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['phas', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mmf', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['rfq', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pale', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pgig', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pvp', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['ray', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mjv', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['tod', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['sara', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['lav', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['tmu', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['dnm', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['mdo', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['shr', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['pcw', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['oaa', 'Eukaryotes;Animals;Vertebrates;Mammals'],
['gga', 'Eukaryotes;Animals;Vertebrates;Birds'],
['pcoc', 'Eukaryotes;Animals;Vertebrates;Birds'],
['mgp', 'Eukaryotes;Animals;Vertebrates;Birds'],
['cjo', 'Eukaryotes;Animals;Vertebrates;Birds'],
['nmel', 'Eukaryotes;Animals;Vertebrates;Birds'],
['apla', 'Eukaryotes;Animals;Vertebrates;Birds'],
['acyg', 'Eukaryotes;Animals;Vertebrates;Birds'],
['aful', 'Eukaryotes;Animals;Vertebrates;Birds'],
['tgu', 'Eukaryotes;Animals;Vertebrates;Birds'],
['lsr', 'Eukaryotes;Animals;Vertebrates;Birds'],
['scan', 'Eukaryotes;Animals;Vertebrates;Birds'],
['pmoa', 'Eukaryotes;Animals;Vertebrates;Birds'],
['otc', 'Eukaryotes;Animals;Vertebrates;Birds'],
['pruf', 'Eukaryotes;Animals;Vertebrates;Birds'],
['gfr', 'Eukaryotes;Animals;Vertebrates;Birds'],
['fab', 'Eukaryotes;Animals;Vertebrates;Birds'],
['phi', 'Eukaryotes;Animals;Vertebrates;Birds'],
['pmaj', 'Eukaryotes;Animals;Vertebrates;Birds'],
['ccae', 'Eukaryotes;Animals;Vertebrates;Birds'],
['ccw', 'Eukaryotes;Animals;Vertebrates;Birds'],
['etl', 'Eukaryotes;Animals;Vertebrates;Birds'],
['zab', 'Eukaryotes;Animals;Vertebrates;Birds'],
['fpg', 'Eukaryotes;Animals;Vertebrates;Birds'],
['fch', 'Eukaryotes;Animals;Vertebrates;Birds'],
['clv', 'Eukaryotes;Animals;Vertebrates;Birds'],
['egz', 'Eukaryotes;Animals;Vertebrates;Birds'],
['nni', 'Eukaryotes;Animals;Vertebrates;Birds'],
['acun', 'Eukaryotes;Animals;Vertebrates;Birds'],
['tala', 'Eukaryotes;Animals;Vertebrates;Birds'],
['padl', 'Eukaryotes;Animals;Vertebrates;Birds'],
['achc', 'Eukaryotes;Animals;Vertebrates;Birds'],
['aam', 'Eukaryotes;Animals;Vertebrates;Birds'],
['arow', 'Eukaryotes;Animals;Vertebrates;Birds'],
['npd', 'Eukaryotes;Animals;Vertebrates;Birds'],
['dne', 'Eukaryotes;Animals;Vertebrates;Birds'],
['asn', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['amj', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['cpoo', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['ggn', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pss', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['cmy', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['cpic', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['tst', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['cabi', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['mrv', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['acs', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pvt', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['sund', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pbi', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pmur', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['tsr', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pgut', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['vko', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['pmua', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['zvi', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['gja', 'Eukaryotes;Animals;Vertebrates;Reptiles'],
['xla', 'Eukaryotes;Animals;Vertebrates;Amphibians'],
['xtr', 'Eukaryotes;Animals;Vertebrates;Amphibians'],
['npr', 'Eukaryotes;Animals;Vertebrates;Amphibians'],
['rtem', 'Eukaryotes;Animals;Vertebrates;Amphibians'],
['bbuf', 'Eukaryotes;Animals;Vertebrates;Amphibians'],
['dre', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['srx', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sanh', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sgh', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ccar', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['caua', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ipu', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['phyp', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['smeo', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['amex', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['eee', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['tru', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['lco', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ncc', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['cgob', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ely', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['plep', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sluc', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ecra', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pflv', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['gat', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ppug', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['msam', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['cud', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['mze', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['onl', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['oau', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ola', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['oml', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['xma', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['xco', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['xhe', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pret', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pfor', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['plai', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pmei', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['gaf', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['cvg', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ctul', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['nfu', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['kmr', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['alim', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['aoce', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['csem', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pov', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ssen', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['hhip', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['lcf', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sdu', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['slal', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['xgl', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['hcq', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['bpec', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['malb', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sasa', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['otw', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['omy', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['ogo', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['one', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['salp', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['snh', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['els', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['sfm', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pki', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['aang', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['loc', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['pspa', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['arut', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['lcm', 'Eukaryotes;Animals;Vertebrates;Fishes'],
['cmk', 'Eukaryotes;Animals;Vertebrates;Cartilaginous fishes'],
['rtp', 'Eukaryotes;Animals;Vertebrates;Cartilaginous fishes'],
['sclv', 'Eukaryotes;Animals;Ascidians'],
['ccin', 'Eukaryotes;Animals;Arthropods;Insects'],
['otu', 'Eukaryotes;Animals;Arthropods;Insects'],
['dsv', 'Eukaryotes;Animals;Arthropods;Chelicerates'],
['rsan', 'Eukaryotes;Animals;Arthropods;Chelicerates'],
['rmp', 'Eukaryotes;Animals;Arthropods;Chelicerates'],
['tut', 'Eukaryotes;Animals;Arthropods;Chelicerates'],
['pcan', 'Eukaryotes;Animals;Mollusks'],
['bgt', 'Eukaryotes;Animals;Mollusks'],
['hrf', 'Eukaryotes;Animals;Mollusks'],
['crg', 'Eukaryotes;Animals;Mollusks'],
['egl', 'Eukaryotes;Animals;Flatworms']], dtype=object)
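For illustration, the category filter described above could look roughly like the sketch below. This is not the actual download script: the names `KEPT_CATEGORIES` and `ko_is_kept` are hypothetical, and it assumes that a lineage string names its category as one of its semicolon-separated levels, as in the rows listed above.

```python
# Categories considered when the gene data was downloaded (per the comment above).
KEPT_CATEGORIES = {"Archaea", "Bacteria", "Fungi"}


def ko_is_kept(organism_rows, kept=KEPT_CATEGORIES):
    """Return True if at least one organism lineage contains a kept category.

    organism_rows: iterable of (organism_code, lineage) pairs, where lineage
    is a semicolon-separated string like the rows in the list above.
    """
    for _code, lineage in organism_rows:
        levels = {level.strip() for level in lineage.split(";")}
        if levels & kept:
            return True
    return False


# A few rows copied from the K17348 list above.
k17348_rows = [
    ("hsa", "Eukaryotes;Animals;Vertebrates;Mammals"),
    ("crg", "Eukaryotes;Animals;Mollusks"),
    ("egl", "Eukaryotes;Animals;Flatworms"),
]
print(ko_is_kept(k17348_rows))  # False
```

Since every row for K17348 is under 'Eukaryotes;Animals', a filter of this shape would skip the KO entirely, which matches what is seen in the data dump.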
Ah, that makes sense. So @raquellewei can remove from the hierarchy those KOs that don't show up in kegg_genes.faa, correct?
Correct. I can give @raquellewei a list of KOs in which each one is associated with at least one organism belonging to 'Archaea', 'Bacteria', or 'Fungi'. Alternatively, she can directly remove the KOs that don't show up in kegg_genes.faa.
@raquellewei, please let me know which way you prefer.
Go ahead and remove the KOs that don't show up in kegg_genes.faa (i.e. don't include Eukaryotes and the like in the edge list).
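A rough sketch of that pruning step, in case it helps. Everything here is an assumption rather than the project's real code: the helper names `kos_in_fasta` and `prune_edges` are hypothetical, the edge list is assumed to be (parent, child) pairs, and the KO identifier (K followed by five digits) is assumed to appear somewhere on each FASTA header line of kegg_genes.faa. The sample header and edges below are made up.

```python
import re

# KO identifiers are 'K' followed by five digits, e.g. K17348.
KO_PATTERN = re.compile(r"\bK\d{5}\b")


def kos_in_fasta(lines):
    """Collect KO identifiers found on FASTA header lines (those starting with '>')."""
    found = set()
    for line in lines:
        if line.startswith(">"):
            found.update(KO_PATTERN.findall(line))
    return found


def prune_edges(edges, present_kos):
    """Drop edges whose child is a KO absent from present_kos.

    Children that are not KO identifiers (e.g. hierarchy categories) are kept.
    """
    return [
        (parent, child)
        for parent, child in edges
        if not KO_PATTERN.fullmatch(child) or child in present_kos
    ]


# Made-up header format and edges, for illustration only.
sample_fasta = [">org:gene1|K00001", "MKV", ">org:gene2|K00002", "MLA"]
sample_edges = [
    ("root", "category_A"),
    ("category_A", "K00001"),
    ("category_A", "K17348"),
]
present = kos_in_fasta(sample_fasta)
print(prune_edges(sample_edges, present))
# [('root', 'category_A'), ('category_A', 'K00001')]
```

For the real file, the set could be built with `with open("kegg_genes.faa") as handle: present = kos_in_fasta(handle)`, adjusting the header parsing to whatever format the dump actually uses.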
It appears there are KOs that exist in KEGG that aren't being scraped. For example: K17348. This shows up on their website here, as well as in the hierarchy:
But even though the website shows associated protein sequences, these are not in the data dump:
returns nothing.