neosr-project/neosr

Join our Discord

neosr is an open-source framework for training super-resolution models. It provides a comprehensive and reproducible environment for achieving state-of-the-art image restoration results, making it suitable for community enthusiasts, professionals and academic machine learning researchers alike. It serves as a versatile platform and aims to bridge the gap between practical application and academic research in the field.

Accessible: implements a wide range of the latest advancements in single-image super-resolution networks, losses, optimizers and augmentations. Users can easily explore, adapt and experiment with various configurations for their specific needs, even without coding skills.
Efficient: optimized for faster training iterations, quicker convergence and low GPU requirements, making it an efficient choice for both research and practical use cases.
Practical: focuses on the real-world use of super-resolution to realistically restore degraded images in various domains, including photos, anime/cartoons, illustrations and more. It's also suitable for critical applications like medical imaging, forensics, geospatial and others (although caution should be taken in those cases).
Reproducible: this framework emphasizes the importance of reproducible research. It provides deterministic training environments that can create bit-exact reproducible models (on the same platform), ensuring predictable and reliable results, which are essential for maintaining consistency in academic validation.
Simple: features are easy to implement or modify. Code is written in readable Python, no fancy styling. All code is validated and formatted by ruff, mypy and torchfix.
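
The bit-exact claim in the Reproducible item can be checked by comparing the files saved by two runs byte for byte. The sketch below is a minimal example using only the Python standard library. The function name is hypothetical and not part of neosr, and the approach assumes the checkpoint format stores no run-specific metadata such as timestamps.

```python
import filecmp
from pathlib import Path


def checkpoints_identical(first: Path, second: Path) -> bool:
    """Return True when both files exist and contain exactly the same bytes."""
    if not (first.is_file() and second.is_file()):
        return False
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(first, second, shallow=False)
```

For example, calling `checkpoints_identical(Path("run_a/model.pth"), Path("run_b/model.pth"))` (hypothetical paths) after two runs with the same seed and configuration should return `True` if the runs were reproduced bit-exactly.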

For more information see our wiki.

🤝 support the project

[!TIP] Consider supporting the project on KoFi ☕ or Patreon

💻 installation

Requires Python 3.12 and CUDA >=12.4. Clone the repository and install via poetry:

git clone https://github.com/neosr-project/neosr
cd neosr
poetry install --sync

See the Installation Instructions for more details.

⏩ quick start

Start training by running:

python train.py -opt options.toml

Here, options.toml is a configuration file. Templates can be found in options.

[!TIP] Please read the wiki Configuration Walkthrough for an explanation of each option.

✨ features

supported archs:

arch	option
Real-ESRGAN	`esrgan`
SRVGGNetCompact	`compact`
SwinIR	`swinir_small`, `swinir_medium`
HAT	`hat_s`, `hat_m`, `hat_l`
OmniSR	`omnisr`
SRFormer	`srformer_light`, `srformer_medium`
DAT	`dat_small`, `dat_medium`, `dat_2`
DITN	`ditn`
DCTLSA	`dctlsa`
SPAN	`span`
Real-CUGAN	`cugan`
CRAFT	`craft`
SAFMN	`safmn`, `safmn_l`
RGT	`rgt`, `rgt_s`
ATD	`atd`, `atd_light`
PLKSR	`plksr`, `plksr_tiny`
RealPLKSR	`realplksr`, `realplksr_s`
DRCT	`drct`, `drct_l`, `drct_s`
MSDAN	`msdan`
SPANPlus	`spanplus`, `spanplus_sts`, `spanplus_s`, `spanplus_st`
HiT-SRF	`hit_srf`, `hit_srf_medium`, `hit_srf_large`
HMA	`hma`, `hma_medium`, `hma_large`
MAN	`man`, `man_tiny`, `man_light`
light-SAFMN++	`light_safmnpp`
MoSR	`mosr`, `mosr_t`
GRFormer	`grformer`, `grformer_medium`, `grformer_large`
EIMN	`eimn`, `eimn_a`, `eimn_l`

[!NOTE] For all arch-specific parameters, read the wiki.

under testing

arch	option
Swin2-MoSE	`swin2mose`
LMLT	`lmlt`, `lmlt_tiny`, `lmlt_large`
DCT	`dct`
FIWHN	`fiwhn`
KRGN	`krgn`
PlainUSR	`plainusr`, `plainusr_ultra`, `plainusr_large`
HASN	`hasn`
FlexNet	`flexnet`, `metaflexnet`
CFSR	`cfsr`

supported discriminators:

net	option
U-Net w/ SN	`unet`
PatchGAN w/ SN	`patchgan`
EA2FPN (bespoke, based on A2-FPN)	`ea2fpn`
DUnet	`dunet`

supported optimizers:

optimizer	option
Adam	`Adam` or `adam`
AdamW	`AdamW` or `adamw`
NAdam	`NAdam` or `nadam`
Adan	`Adan` or `adan`
AdamW Win2	`AdamW_Win` or `adamw_win`
ECO strategy	`eco`, `eco_iters`
AdamW Schedule-Free	`adamw_sf`
Adan Schedule-Free	`adan_sf`
F-SAM	`fsam`, `FSAM`

supported losses:

loss	option
L1 Loss	`L1Loss`, `l1_loss`
L2 Loss	`MSELoss`, `mse_loss`
Huber Loss	`HuberLoss`, `huber_loss`
CHC (Clipped Huber with Cosine Similarity Loss)	`chc_loss`
NCC (Normalized Cross-Correlation)	`ncc_opt`, `ncc_loss`
Perceptual Loss	`perceptual_opt`, `vgg_perceptual_loss`
GAN	`gan_opt`, `gan_loss`
MS-SSIM	`mssim_opt`, `mssim_loss`
LDL Loss	`ldl_opt`, `ldl_loss`
Focal Frequency	`ff_opt`, `ff_loss`
DISTS	`dists_opt`, `dists_loss`
Wavelet Guided	`wavelet_guided`
Gradient-Weighted	`gw_opt`, `gw_loss`
Perceptual Patch Loss	`perceptual_opt`, `patchloss`, `ipk`
Consistency Loss (Oklab and CIE L*)	`consistency_opt`, `consistency_loss`
KL Divergence	`kl_opt`, `kl_loss`
MS-SWD	`msswd_opt`, `msswd_loss`
FDL	`fdl_opt`, `fdl_loss`

supported augmentations:

augmentation	option
Rotation	`use_rot`
Flip	`use_hflip`
MixUp	`mixup`
CutMix	`cutmix`
ResizeMix	`resizemix`
CutBlur	`cutblur`

supported models:

model	description	option
Image	Base model for SISR, supports both Generator and Discriminator	`image`
OTF	Builds on top of `image`, adding Real-ESRGAN on-the-fly degradations	`otf`

supported dataloaders:

loader	option
Paired datasets	`paired`
Single datasets (for inference, no GT required)	`single`
Real-ESRGAN on-the-fly degradation	`otf`

📸 datasets

As part of neosr, I have released a dataset series called Nomos. The purpose of these datasets is to distill only the best images from the academic and community datasets. A total of 14 datasets were manually reviewed and processed, including: Adobe-MIT-5k, RAISE, LSDIR, LIU4k-v2, KONIQ-10k, Nikon LL RAW, DIV8k, FFHQ, Flickr2k, ModernAnimation1080_v2, Rawsamples, SignatureEdits, Hasselblad raw samples and Unsplash.

Nomos-v2 (recommended): contains 6000 images, multipurpose. Data distribution:

pie
  title Nomos-v2 distribution
  "Animal / fur" : 439
  "Interiors" : 280
  "Exteriors / misc" : 696
  "Architecture / geometric" : 1470
  "Drawing / painting / anime" : 1076
  "Humans" : 598
  "Mountain / Rocks" : 317
  "Text" : 102
  "Textures" : 439
  "Vegetation" : 574

nomos_uni (recommended for lightweight networks): contains 2989 images, multipurpose. Meant to be used on lightweight networks (<800k parameters).
hfa2k: contains 2568 anime images.
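
To read the Nomos-v2 distribution above as proportions rather than raw counts, the category totals can be converted to percentages. The sketch below copies the counts from the chart; the variable and function names are only illustrative.

```python
# Category counts copied from the Nomos-v2 distribution chart.
NOMOS_V2 = {
    "Animal / fur": 439,
    "Interiors": 280,
    "Exteriors / misc": 696,
    "Architecture / geometric": 1470,
    "Drawing / painting / anime": 1076,
    "Humans": 598,
    "Mountain / Rocks": 317,
    "Text": 102,
    "Textures": 439,
    "Vegetation": 574,
}


def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100 * count / total for name, count in counts.items()}


total_images = sum(NOMOS_V2.values())
ranked = sorted(shares(NOMOS_V2).items(), key=lambda item: item[1], reverse=True)
for name, percent in ranked:
    print(f"{name:<28}{percent:5.1f}%")
print("images listed in chart:", total_images)
```

The counts in the chart add up to 5991, so the percentages are relative to that figure.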

dataset download	sha256
nomosv2 (3GB)	sha256
nomosv2.lmdb (3GB)	sha256
nomosv2_lq_4x (187MB)	sha256
nomosv2_lq_4x.lmdb (187MB)	sha256
nomos_uni (1.3GB)	sha256
nomos_uni.lmdb (1.3GB)	sha256
nomos_uni_lq_4x	sha256
nomos_uni_lq_4x.lmdb	sha256
hfa2k	sha256
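
Since each download is published alongside a sha256 value, a downloaded archive can be verified before use. The following is a minimal sketch with Python's standard `hashlib`; the function names are hypothetical, and the expected digest has to be copied from the corresponding sha256 link in the table.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte archives need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If `matches_checksum` returns `False`, the download is likely incomplete or corrupted and should be fetched again.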

community datasets

Datasets made by the upscaling community. More information can be found in each author's repository.

4xNomosRealWeb Dataset: realistically degraded LQs for the Nomos-v2 dataset (from @Phhofm).
FaceUp: Curated version of FFHQ.
SSDIR: Curated version of LSDIR.
ArtFaces: Curated version of MetFaces.
Nature Dataset: Curated version of iNaturalist.
digital_art_v2: Digital art dataset from @umzi2.

dataset	download
@Phhofm 4xNomosRealWeb	Release page
@Phhofm FaceUp	GDrive (4GB)
@Phhofm SSDIR	GDrive (4.5GB)
@Phhofm ArtFaces	Release page
@Phhofm Nature Dataset	Release page
@umzi2 Digital Art (v2)	Release page

📖 resources

📄 license and acknowledgements

Released under the Apache license. All licenses are listed in license/readme. This code was originally based on BasicSR.

Thanks to victorca25/traiNNer, styler00dollar/Colab-traiNNer and timm for providing helpful insights into some problems.

Thanks to active contributors @Phhofm, @Sirosky, and @umzi2 for helping with tests and bug reporting.