TorchDSP / torchsig

TorchSig is an open-source signal processing machine learning toolkit based on the PyTorch data handling pipeline.
MIT License
170 stars 38 forks source link

Initial draft of restructuring transforms #106

Closed lboegner closed 1 year ago

lboegner commented 1 year ago

Summary

Initial draft of restructuring the transforms as mentioned in issue #93 .

Here, we collapse all the transforms and functional code into transforms.py, target_transforms.py, and functional.py files. This collapsing mirrors the structure of the code accepted by the CV and audio ML communities used in both torchvision and torchaudio. Consolidating the code simplifies shared uses across the previously split out categories of transforms, particularly with the functional code. Conversely, the transforms.py file is now quite large. We could split these classes out to have their own dedicated files, but prior to doing so, we may want to continue cleaning up their code as many of them do too much in their __call__ methods that should be moved to functional.py. The transforms may also benefit from breaking out the SignalDescription updates into dedicated methods.

In addition to the consolidation, any previously broken example code within the transform docstrings (as highlighted in #83) was updated to be valid; however, we may still benefit from adding more example code throughout.

Finally, __repr__ methods were added to all transforms to make our transform classes' string/print outputs much more meaningful.

Tests

flake8 test:

flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

Output:

0

Local tests:

pytest -v

Output:

...
=========================== 10 passed, 1 warning in 9.95s ===============================
gvanhoy commented 1 year ago

Looks like a lot of work my friend -- but definitely the direction things need to go. One big file is totally fine if it's structure is highly predictable and consistent. ctrl+f for the win. I'm already annoyed with the 100+ import statements coming from torchsig (that I myself initially caused), so this is a return to a more normal structure. Only reason to keep the old structure would be if people wanted to import only parts of the transforms, but that's not a use-case that's likely to be common or that we need to support.

gvanhoy commented 1 year ago

As mentioned here , we should be able to automate the tests in docstrings. Maybe not fully necessary to work for minor releases, but major releases should work.

I also agree, the transforms having huge __call__ is an indication that there's benefit in considering refactoring. I remember talking about how to go about this, but I'm not sure at the moment what's the best way.

gvanhoy commented 1 year ago

Well, at least if we want to automate doctests, we should not automate the ones that generate an entire dataset like WidebandSig53. With these changes, all the doctests work again, which is awesome.