scikit-image / boilerplate-utils

some utility scripts used by scikit-image developers
BSD 3-Clause "New" or "Revised" License

Getting lists of dependent packages #3

Open grlee77 opened 4 years ago

grlee77 commented 4 years ago

It is useful to know who is using scikit-image when planning funding proposals. I was looking a little into how one can extract information such as that presented in GitHub's dependents view.

So far, it seems that it is possible to query the dependencies of scikit-image via an experimental API, but there is no public API for querying the dependent packages. You can browse the list manually, but that is tedious given that there are more than 1,000 packages in our case!

However, I found that with some modifications to the web scraping script from this Stack Overflow post, we can extract this information into a list of packages along with the number of stars and forks for each dependent package.
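The scraping step might look roughly like the sketch below. It is not the script from the Stack Overflow post: it uses only the standard library, and the markup it expects (a `Box-row` div per package, a repository link marked with `data-hovercard-type="repository"`, and `octicon-star` / `octicon-repo-forked` icons before the counts) is an assumption about the dependents page that may not match the live site. `DependentsParser`, `parse_dependents`, and `SAMPLE_HTML` are hypothetical names, and fetching or paginating the real page is left out.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for one page of the dependents view.
SAMPLE_HTML = """
<div class="Box-row">
  <a data-hovercard-type="repository" href="/owner-a/repo-a">repo-a</a>
  <span><svg class="octicon octicon-star"></svg> 1,200</span>
  <span><svg class="octicon octicon-repo-forked"></svg> 202</span>
</div>
<div class="Box-row">
  <span>ghost-package</span>
</div>
"""


class DependentsParser(HTMLParser):
    """Collect name, stars and forks from each assumed 'Box-row' element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None   # row currently being filled in
        self._pending = None   # counter that the next number belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "Box-row" in classes:
            self._current = {"name": None, "stars": 0, "forks": 0}
            self.rows.append(self._current)
            self._pending = None
        elif self._current is None:
            return
        elif tag == "a" and attrs.get("data-hovercard-type") == "repository":
            self._current["name"] = (attrs.get("href") or "").lstrip("/")
        elif tag == "svg":
            if "octicon-star" in classes:
                self._pending = "stars"
            elif "octicon-repo-forked" in classes:
                self._pending = "forks"

    def handle_data(self, data):
        text = data.strip().replace(",", "")
        if self._current is not None and self._pending and text.isdigit():
            self._current[self._pending] = int(text)
            self._pending = None


def parse_dependents(html):
    """Return rows that link to a repository; rows without a link are skipped."""
    parser = DependentsParser()
    parser.feed(html)
    parser.close()
    return [row for row in parser.rows if row["name"]]


print(parse_dependents(SAMPLE_HTML))
```

Rows without a repository link are dropped here on the assumption that this is how inactive ("ghost") entries appear in the markup.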

We can then combine that with PyGitHub to retrieve the "topics" associated with each of these packages, so that we can sort by number of stars and keep only those packages containing certain terms in the repository name or topic list (e.g. "brain", "cell", "mri", "microscopy").

Running this script on scikit-image gave a list of 857 packages that depend on scikit-image and are active (i.e. are not represented by a "ghost" icon in the web interface). Of these:

The numbers above are for ALL application areas. I excluded packages with < 5 stars and then filtered to retain only those that have names/topics related to bioimaging, microscopy, medical imaging, etc. This results in a final list of

Topic terms used to determine biological application status:

    bioimage_search_terms = [
        'airways', 'anatomy', 'arteries', 'astrocytes',
        'atomic-force-microscopy', 'afm', 'axon', 'bioimage-informatics',
        'bioinformatics', 'biologists', 'biomedical-image-processing',
        'bionic-vision', 'biophysics', 'brain-connectivity', 'brain-imaging',
        'brain-mri', 'brain-tumor-segmentation', 'brats', 'calcium',
        'cancer-research', 'cell-biology', 'cell-detection',
        'cell-segmentation', 'computational-pathology', 'connectome',
        'connectomics', 'cryo-em', 'ct-data', 'deconvolution-microscopy',
        'dicom', 'dicom-rt', 'digital-pathology-data', 'digital-pathology',
        'digital-slide-archive', 'dmri', 'electron-microscopy',
        'electrophysiology', 'fluorescence',
        'fluorescence-microscopy-imaging', 'fmri', 'fmri-preprocessing',
        'functional-connectomes', 'healthcare-imaging', 'histology', 'voxel',
        'microorganism-colonies', 'microscopy', 'microscopy-images',
        'neuroimaging', 'medical', 'medical-image-computing',
        'medical-image-processing', 'medical-images', 'medical-imaging',
        'mri', 'myelin', 'neural-engineering', 'neuroanatomy',
        'neuroimaging', 'neuroimaging-analysis', 'neuropoly', 'neuroscience',
        'nih-brain-initiative', 'openslide', 'pathology', 'pathology-image',
        'radiation-oncology', 'radiation-physics', 'raman',
        'retinal-implants', 'scanning-probe-microscopy',
        'scanning-tunnelling-microscopy', 'single-cell-imaging',
        'slide-images', 'spectroscopy', 'spinalcord', 'stm', 'stem',
        'stitching', 'structural-connectomes', 'tissue-localization',
        'tomography', 'volumetric-images', 'whole-slide-image',
        'whole-slide-imaging',
    ]

Search terms in project name string:

    reponame_terms = [
        'brain', 'cell', 'ecg', 'eeg', 'medi', 'mri', 'neuro', 'pathol',
        'retin', 'slide', 'spectro', 'tissue', 'tomo',
    ]
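As a rough illustration of the filtering described above, the sketch below keeps packages with at least 5 stars whose topics or name match a search term, then sorts by stars. It is an assumption about how the filter works, not the actual script: the term lists are a small subset of the ones above, name matching is done as a substring test on the lowercased `owner/repo` string, and `is_bio_related` and `select_packages` are hypothetical helper names. In practice the topics would come from PyGitHub (for example via `Repository.get_topics()`); here they are hard-coded, and the two `example-org` entries are made up.

```python
# Small subsets of the term lists above, for illustration only.
BIOIMAGE_TOPICS = {
    "microscopy", "medical-image-processing", "cell-segmentation", "neuroimaging",
}
REPONAME_TERMS = ["brain", "cell", "mri", "neuro"]
MIN_STARS = 5


def is_bio_related(name, topics):
    """True if any topic is a search term or the name contains a name term."""
    lowered = name.lower()
    has_topic = bool(BIOIMAGE_TOPICS.intersection(topics))
    has_name_term = any(term in lowered for term in REPONAME_TERMS)
    return has_topic or has_name_term


def select_packages(packages):
    """Drop low-star and unrelated packages, then sort by stars (descending)."""
    kept = [
        p for p in packages
        if p["stars"] >= MIN_STARS and is_bio_related(p["name"], p["topics"])
    ]
    return sorted(kept, key=lambda p: p["stars"], reverse=True)


packages = [
    {"name": "CellProfiler/CellProfiler", "stars": 480, "forks": 239, "topics": []},
    {"name": "Project-MONAI/MONAI", "stars": 1200, "forks": 202,
     "topics": ["healthcare-imaging", "medical-image-processing"]},
    # Hypothetical entries: one unrelated package, one below the star cutoff.
    {"name": "example-org/web-gallery", "stars": 300, "forks": 10, "topics": ["django"]},
    {"name": "example-org/cell-counter", "stars": 2, "forks": 0, "topics": ["microscopy"]},
]

for pkg in select_packages(packages):
    print(pkg["name"], pkg["stars"])
```

Matching on the repository name is what lets a package with an empty topic list, such as CellProfiler/CellProfiler, still be picked up.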

A detailed list of dependent biology-related packages with 5 or more stars is given in the table in a later comment below.

Two caveats:

1. The above list is probably a lower bound. There may be other packages that did not list any "topic" terms and did not use an obvious biology-related term in the project name.
2. The above list covers only downstream packages. There are probably an order of magnitude more one-off repositories from individual users that make use of scikit-image without packaging or distributing their code.

grlee77 commented 4 years ago

If there is interest, I can make a PR here with the script used to create the above. Otherwise I can just put it in a gist and link to it from here.

Given that it is based on web scraping assuming specific HTML element names and not an official API, I am sure it is pretty fragile and will likely break at some point in the future when GitHub redesigns their website. Ideally this info will be made available through the official API at some future point.

emmanuelle commented 4 years ago

Awesome! A PR in this repo would be great I think.

grlee77 commented 4 years ago

I exported the list above to a markdown table (DataFrames have a .to_markdown method in recent Pandas releases!)

| name | # of forks | # of stars | topics |
|:---|---:|---:|:---|
| Project-MONAI/MONAI | 202 | 1200 | ['healthcare-imaging', 'deep-learning', 'medical-image-computing', 'medical-image-processing', 'pytorch', 'python3'] |
| DLTK/DLTK | 357 | 1164 | ['deep-learning', 'machine-learning', 'neural-networks', 'tensorflow', 'medical-imaging', 'data-science', 'ml', 'deep-neural-networks', 'python', 'medical', 'dltk', 'dltk-model-zoo', 'neural-network', 'neuroimaging', 'cnn', 'medical-image-processing'] |
| Image-Py/imagepy | 252 | 927 | ['imagej', 'scikit-image', 'opencv', 'simpleitk'] |
| CellProfiler/CellProfiler | 239 | 480 | [] |
| poldracklab/fmriprep | 194 | 321 | ['fmri', 'fmri-preprocessing', 'brain-imaging', 'neuroimaging', 'bids', 'image-processing'] |
| danforthcenter/plantcv | 133 | 294 | ['science', 'phenotyping', 'bioinformatics', 'plant'] |
| delira-dev/delira | 23 | 218 | ['deep-learning', 'radiology', 'medical-imaging', 'machine-learning', 'pytorch', 'tensorflow', 'delira', 'medical-images'] |
| AllenInstitute/AllenSDK | 90 | 171 | ['bioinformatics', 'scientific'] |
| EtienneCmb/visbrain | 34 | 151 | ['vispy', 'gpu', 'neuroscience', 'visualization', 'gui', 'connectivity', 'opengl', 'mni', 'brain', 'sleep', 'plot', 'deep-sources', 'python'] |
| DigitalSlideArchive/HistomicsTK | 58 | 147 | ['computer-vision', 'medical-image-processing', 'machine-learning', 'bioimage-informatics', 'histology', 'python', 'digital-slide-archive'] |
| frankkramer-lab/MIScnn | 33 | 128 | ['deep-learning', 'convolutional-neural-networks', 'medical-image-processing', 'medical-image-segmentation', 'framework', 'computer-vision', 'clinical-decision-support', 'tensorflow', 'medical-image-analysis', 'medical-imaging', 'segmentation', 'neural-network', 'pip', 'healthcare-imaging'] |
| poldracklab/mriqc | 75 | 127 | ['mri', 'quality-control', 'quality-reporter', 'machine-learning', 'neuroimaging'] |
| MouseLand/cellpose | 42 | 124 | ['segmentation', 'cell-segmentation', 'cell-biology'] |
| MouseLand/suite2p | 94 | 120 | ['imaging', 'neuroscience', 'data-analysis'] |
| pycroscopy/pycroscopy | 39 | 118 | ['microscopy', 'imaging', 'spectroscopy', 'materials-science', 'scanning', 'visualization', 'atom', 'atomic-force-microscopy', 'scanning-probe-microscopy', 'afm', 'stm', 'electron-microscopy', 'stem', 'raman', 'infrared', 'statistics', 'scanning-tunnelling-microscopy', 'machine-learning', 'fft', 'signal-processing'] |
| BrancoLab/BrainRender | 25 | 91 | ['neuroscience', 'python', 'anatomy'] |
| QTIM-Lab/DeepNeuro | 29 | 90 | [] |
| dPys/PyNets | 30 | 85 | ['workflow', 'nipype', 'dipy', 'nilearn', 'tractography', 'fmri', 'dmri', 'gaussian-graphical-models', 'ensemble-sampling', 'networks', 'graph-analysis', 'connectomics', 'brain-connectivity', 'structural-connectomes', 'gridsearch', 'functional-connectomes', 'networkx', 'optimization'] |
| neuropoly/axondeepseg | 16 | 78 | ['deep-learning', 'machine-learning', 'myelin', 'axon', 'neuropoly', 'spinalcord', 'segmentation', 'electron-microscopy', 'microscopy', 'histology', 'convolutional-neural-networks'] |
| jrkerns/pylinac | 48 | 70 | ['python', 'medical-physics', 'radiation-oncology'] |
| notmatthancock/pylidc | 27 | 69 | ['lidc-dataset', 'dicom', 'ct-data', 'tcia-dac'] |
| koriavinash1/DeepBrainSeg | 18 | 60 | ['biomedical-image-analysis', 'deep-learning', 'brats', 'ensemble-learning', 'deep-convolutional-neural-networks', 'brain-tumor-segmentation', 'segmentation', 'artificial-intelligence', 'medical-image-analysis'] |
| dicompyler/dicompyler-core | 32 | 58 | ['dicom', 'dicom-rt', 'python', 'radiation-oncology', 'radiation-physics', 'dvh'] |
| SainsburyWellcomeCentre/cellfinder | 14 | 52 | ['microscopy', 'cell-detection', 'registration', 'deep-learning', 'image-analysis', 'neuroscience', 'python', 'neuroanatomy', 'imaging'] |
| LiberTEM/LiberTEM | 44 | 47 | ['electron-microscopy', 'data-processing', 'image-processing', 'python'] |
| bioimagesuiteweb/bisweb | 16 | 45 | ['webassembly', 'neuroimaging-analysis', 'medical', 'image-processing', 'nifti', 'viewer', 'fmri', 'nih-brain-initiative'] |
| neurodata/m2g | 24 | 40 | [] |
| pulse2percept/pulse2percept | 20 | 40 | ['python', 'neuroscience', 'vision', 'neural-engineering', 'bionic-vision', 'retinal-implants'] |
| aramis-lab/clinica | 16 | 35 | ['neuroimaging', 'python', 'machine-learning', 'brainweb', 'bids-format', 'ants', 'freesurfer', 'fsl', 'mrtrix3', 'spm', 'scikit-learn'] |
| AllenCellModeling/aicsimageio | 1 | 33 | ['python', 'imageio', 'microscopy', 'image-metadata', 'scientific-computing', 'scientific-formats'] |
| baccuslab/pyret | 7 | 33 | ['python', 'scientific-computing', 'tools', 'neuroscience', 'electrophysiology'] |
| iitzco/deepbrain | 17 | 31 | ['deep-learning', 'medical-imaging', 'artificial-intelligence', 'brain-mri', 'tensorflow', 'neural-networks', 'machine-learning'] |
| PingjunChen/tissueloc | 6 | 26 | ['whole-slide-imaging', 'pathology', 'computational-pathology', 'pathology-image', 'tissue-localization'] |
| sakoho81/miplib | 7 | 26 | ['image-processing', 'image-restoration', 'image-resolution', 'image-analysis', 'microscopy-images', 'deconvolution-microscopy', 'fourier-analysis'] |
| RivuletStudio/rivuletpy | 6 | 26 | ['neuron', 'neuroinformatics', 'morphology', 'image-processing', 'rivulet2-algorithm', 'python', 'tracing', 'curvilinear-coordinates', 'medical-imaging', 'medical-image-processing', 'airways', 'arteries', 'centerline'] |
| jgamper/compay-syntax | 0 | 25 | ['computational-pathology', 'medical-imaging', 'wsis', 'openslide', 'slide-images', 'pathology-image', 'whole-slide-image', 'whole-slide-imaging'] |
| neuro-ml/deep_pipe | 6 | 22 | [] |
| PennBBL/qsiprep | 11 | 20 | ['python', 'diffusion-mri', 'tractography', 'connectomics', 'pipelines'] |
| NeurodataWithoutBorders/nwb-jupyter-widgets | 12 | 19 | [] |
| david-hoffman/pyOTF | 9 | 19 | ['microscopy', 'numpy'] |
| flika-org/flika | 1 | 19 | ['biologists', 'python', 'imagej', 'calcium', 'image-processing', 'fluorescence', 'microscopy', 'neurons', 'astrocytes'] |
| haranrk/DigiPathAI | 4 | 18 | ['deep-learning', 'cancer-research', 'segmentation', 'gui', 'medical-image-analysis'] |
| PingjunChen/pyslide | 6 | 18 | ['pathology', 'whole-slide-imaging', 'deep-learning', 'python', 'image-analysis'] |
| pycroscopy/pyUSID | 6 | 17 | ['hdf5', 'imaging', 'spectroscopy', 'data', 'parallel-computing'] |
| mckib2/pygrappa | 6 | 17 | ['mri', 'grappa', 'parallel-imaging', 'python', 'image-reconstruction', 'sms', 'grog', 'sense'] |
| histolab/histolab | 2 | 17 | ['digital-pathology-data', 'digital-pathology', 'bioinformatics', 'biology'] |
| nipy/dmriprep | 8 | 16 | [] |
| SainsburyWellcomeCentre/amap-python | 7 | 16 | ['microscopy', 'registration', 'image-analysis', 'neuroscience', 'python', 'neuroanatomy', 'imaging'] |
| vsoch/pybraincompare | 9 | 15 | ['neuro', 'brain', 'comparison', 'd3'] |
| imr-framework/virtual-scanner | 6 | 15 | ['mri'] |
| scholi/pySPM | 13 | 15 | ['python', 'spm', 'afm', 'stm', 'bruker', 'iontof', 'tof-sims', 'nanoscan', 'sfm', 'nanonis', 'sxm', 'image-processing', 'python-library', 'nanonis-sxm', 'principal-component-analysis'] |
| neurodata/ndreg | 5 | 15 | [] |
| HumanBrainProject/neuroglancer-scripts | 12 | 14 | ['neuroimaging'] |
| alexblaessle/PyFRAP | 4 | 14 | ['frap', 'fluorescence-microscopy-imaging', 'python', 'gui', 'frap-analysis', 'frap-experiment', 'pde', 'data-fitting'] |
| lens-biophotonics/ZetaStitcher | 2 | 14 | ['stitching', 'microscopy', 'microscopy-images', 'fluorescence-microscopy-imaging', 'volumetric-images'] |
| aschampion/diluvian | 15 | 14 | ['connectomics', 'keras-neural-networks'] |
| stefsmeets/instamatic | 7 | 13 | ['electron-microscopy', 'electron-diffraction', 'serial-crystallography', '3d-electron-diffraction', 'micro-ed', 'data-collection', 'automation'] |
| RI-imaging/ODTbrain | 4 | 12 | ['diffraction-tomography', 'backpropagation', 'single-cell-imaging', 'biophysics'] |
| kushalkolar/MESmerize | 8 | 12 | ['calcium-imaging', 'neuroscience', 'interactive-visualizations', 'plotting', 'snap', 'signal-processing'] |
| timsainb/birdbrain | 5 | 12 | ['birdsong', 'neuroscience', 'atlas'] |
| CEA-COSMIC/pysap-mri | 10 | 11 | [] |
| Karol-G/Gcam | 1 | 11 | ['pytorch', 'gradcam', 'gradcam-plus-plus', 'grad-cam', 'guided-backpropagation', 'guidedgradcam', 'visualization', 'saliency', 'cnn-visualization', 'guided-grad-cam', 'segmentation', '2d', '3d', 'medical-imaging', 'gradient-visualization', 'gcam'] |
| SainsburyWellcomeCentre/neuro | 4 | 10 | ['python', 'neuroscience', 'neuroanatomy', 'image-analysis', 'microscopy', 'registration', 'visualisation', 'cell-detection'] |
| neurodata/ardent | 6 | 10 | [] |
| CellProfiler/centrosome | 25 | 9 | [] |
| zhuangjun1981/retinotopic_mapping | 10 | 9 | [] |
| imaging-tools/ivvv | 3 | 9 | ['voxel', 'volume-rendering'] |
| ziatdinovmax/atomai | 0 | 9 | ['microscopy', 'materials-science', 'machine-learning', 'deep-learning', 'atom-resolved-data', 'atom-finding', 'colab', 'defects'] |
| ziatdinovmax/GPim | 2 | 9 | ['image-processing', 'hyperspectral-images', 'gaussian-processes', 'bayesian-optimization', 'microscopy', 'colab-notebook', 'lattice-models'] |
| Erik-White/ColonyScanalyser | 2 | 8 | ['python', 'imaging', 'image-analysis', 'scikit-image', 'biology', 'microorganism-colonies', 'computer-vision'] |
| ELEKTRONN/ELEKTRONN2 | 8 | 7 | ['theano', '3d-convolutional-network', 'convolutional-neural-networks', '3d-cnn', 'cnn', 'biomedical-image-processing', 'electron-microscopy'] |
| jeffkinnison/florin | 1 | 6 | ['computer-vision', 'parallel-computing', 'distributed-computing', 'neuroscience', 'iris-biometrics'] |
| TariqAHassan/BioVida | 1 | 6 | ['machine-learning', 'biomedical-informatics', 'data-science', 'bioinformatics', 'imaging-informatics'] |
| PrincetonUniversity/ASPIRE-Python | 0 | 6 | ['data-science', 'mathematics', 'machine-learning', 'cryo-em', 'bioinformatics', 'classification-algorithm', 'aspire', 'python'] |
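
For readers without pandas at hand, the export step behind the table above can be approximated with a few lines of standard-library Python. This is a dependency-free stand-in for `DataFrame.to_markdown`, not the call used for the table; `to_markdown_table` is a hypothetical helper and its output formatting (plain `---` dividers, no alignment or padding) is simpler than what pandas produces.

```python
def to_markdown_table(rows, columns):
    """Render a list of dicts as a pipe-delimited markdown table."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(str(row[col]) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Two rows copied from the table above.
rows = [
    {"name": "MouseLand/cellpose", "# of forks": 42, "# of stars": 124,
     "topics": ["segmentation", "cell-segmentation", "cell-biology"]},
    {"name": "neurodata/m2g", "# of forks": 24, "# of stars": 40, "topics": []},
]

print(to_markdown_table(rows, ["name", "# of forks", "# of stars", "topics"]))
```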