Dev sprint ideas: More tests, type hints and less complexity

itsvinayak commented 4 years ago

Currently, some of the programs use static type checking, like this program, but others do not.

It's good practice to use static typing, as it makes code clearer and more readable. Should we make it a standard for this repository? We can use mypy for testing code.
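To make the suggestion concrete, here is a minimal sketch of what an annotated function could look like. The function `mean` is a made-up example, not taken from the repository.

```python
from __future__ import annotations


def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers.

    >>> mean([1.0, 2.0, 3.0])
    2.0
    """
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


# A static checker such as mypy would be expected to reject a call like
# mean("123") before the program runs, because a str is not a list[float].
```

Without the annotations, the same mistake would only surface when that line actually executes.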

Dev sprint ideas:

- [ ] Add tests to Python files with <10% test coverage.
- [ ] Add static typing to functions and methods.
- [ ] Set flake8 --max-complexity=15 (Ensure files have strong tests before refactoring). Test results from #2139...
  - [ ] ./boolean_algebra/quine_mc_cluskey.py:82:1: C901 'selection' is too complex (17)
  - [ ] ./digital_image_processing/edge_detection/canny.py:20:1: C901 'canny' is too complex (17) @lighttxu
  - [ ] ./graphs/minimum_spanning_tree_prims.py:5:1: C901 'PrimsAlgorithm' is too complex (21)
    - [ ] Add doctests aligned with https://en.wikipedia.org/wiki/Prim%27s_algorithm
    - [ ] In a separate PR reduce the McCabe complexity
  - [ ] ./linear_algebra/src/polynom-for-points.py:1:1: C901 'points_to_polynomial' is too complex (23) @nic-dern
  - [ ] ./machine_learning/linear_discriminant_analysis.py:251:1: C901 'main' is too complex (25)
  - [x] ./hashes/hamming_code.py:71:1: C901 'emitterConverter' is too complex (16) #2140
  - [x] ./hashes/hamming_code.py:153:1: C901 'receptorConverter' is too complex (20) #2140
  - [x] ./project_euler/problem_551/sol1.py:20:1: C901 'next_term' is too complex (16) #2141

cclauss commented 4 years ago

We push all new contributions to use type hints as discussed in CONTRIBUTING.md. All efforts to add type hints to existing algorithms would be warmly received.

> We can use mypy for testing code

We do this already https://github.com/TheAlgorithms/Python/blob/master/.travis.yml#L17

onlinejudge95 commented 4 years ago

@cclauss How about a dev sprint kind of thing where we go about adding all such improvements?

cclauss commented 4 years ago

Cool idea! Another thing (a sprint topic?) that is bugging me is code complexity, which we currently set to 25 but which I would be much happier to see at 15. We would need to ensure that the files have strong type hints and tests before modifying them to reduce their cyclomatic complexity.
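As a rough illustration of one way cyclomatic complexity can be brought down, the sketch below replaces a chain of branches with a lookup table. Both functions and the `OPERATIONS` table are hypothetical and not taken from any file in the repository. Each `if`/`elif` adds a decision point, so the first version has four while the second has one.

```python
from __future__ import annotations

import operator
from typing import Callable


def apply_operation_branchy(op: str, a: float, b: float) -> float:
    """Every if/elif below counts as one more path through the function."""
    if op == "add":
        return a + b
    elif op == "sub":
        return a - b
    elif op == "mul":
        return a * b
    elif op == "div":
        return a / b
    raise ValueError(f"unknown operation: {op}")


# Table-driven alternative: the branching moves into data.
OPERATIONS: dict[str, Callable[[float, float], float]] = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def apply_operation(op: str, a: float, b: float) -> float:
    """Same behaviour as the branchy version, with a single decision point."""
    if op not in OPERATIONS:
        raise ValueError(f"unknown operation: {op}")
    return OPERATIONS[op](a, b)
```

Having tests in place first makes it possible to check that both versions agree on the same inputs before the old one is deleted.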

cclauss commented 4 years ago

Another cool sprint topic would be to add doctests to Python files that have <10% test coverage. Some files like file_transfer/send_file.py and the web programming files are difficult to write tests for but others should be fair game.
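For anyone new to doctests, a minimal sketch follows. The function `is_palindrome` is a made-up example rather than an existing file in the repo; the point is only the shape of the docstring and the `doctest.testmod()` call.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same in both directions.

    Case and non-alphanumeric characters are ignored.

    >>> is_palindrome("A man, a plan, a canal: Panama")
    True
    >>> is_palindrome("doctest")
    False
    >>> is_palindrome("")
    True
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


if __name__ == "__main__":
    import doctest

    # Runs every example found in the module's docstrings.
    doctest.testmod()
```

Running the file directly executes the examples, and `python -m doctest -v <file>` does the same from the command line.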

cclauss commented 4 years ago

Should we have a short sprint or a long one? One idea would be a 24-hour sprint: given that tomorrow is the summer solstice (the longest day of the year), the sprint could start at midnight tonight (in whatever timezone the contributor is in) and last for 24 hours. #2128 could be our tracking issue for tasks and accomplishments. Thoughts on this Summer Solstice Special Sprint idea?

onlinejudge95 commented 4 years ago

Seems awesome. We can gain some attention with a Summer Solstice Special Sprint. I am on Gitter if you want to discuss.

l3str4nge commented 4 years ago

> Another cool sprint topic would be to add doctests to Python files that have <10% test coverage. Some files like file_transfer/send_file.py and the web programming files are difficult to write tests for but others should be fair game.

Difficult but not impossible. We can create an issue with some labels. Perhaps we can find someone who will write tests, or even code (for example, a simple server that returns responses), for testing it :smiley:
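One possible shape for such a test, sketched under assumptions: `send_payload`, `receive_all`, and `round_trip` are hypothetical helpers, not the actual code in file_transfer/send_file.py. The sketch uses `socket.socketpair()` so that no real network port or external server is needed.

```python
import socket
import threading


def send_payload(conn: socket.socket, payload: bytes) -> None:
    """Hypothetical stand-in for the sending side of a file transfer."""
    conn.sendall(payload)
    conn.shutdown(socket.SHUT_WR)  # tell the receiver that no more data follows


def receive_all(conn: socket.socket, chunk_size: int = 1024) -> bytes:
    """Read from the connection until the sender signals end-of-data."""
    chunks = []
    while True:
        chunk = conn.recv(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def round_trip(payload: bytes) -> bytes:
    """Send payload through a connected in-process socket pair and return what arrives."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        # Send from a thread so a large payload cannot block on a full buffer.
        worker = threading.Thread(target=send_payload, args=(sender, payload))
        worker.start()
        received = receive_all(receiver)
        worker.join()
    return received
```

A test would then only need to assert that `round_trip(data) == data` for a few payload sizes.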

spamegg1 commented 4 years ago

I have an idea. How about making the code more idiomatic and Pythonic?

For example, currently I see a lot of code of the form:

for i in range(len(some_iterable)):
    ...
    # do something with i or some_iterable[i]

It is better to do:

for index, item in enumerate(some_iterable):
    ...
    # do something with item or index

or if it is a dictionary:

for key, value in some_iterable.items():
    ...
    # do something with key or value

This is just one example. There are also many pieces of code where comprehension can replace for-loops.
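A small sketch of the comprehension point, using a made-up list of names purely for illustration:

```python
words = ["prim", "kruskal", "dijkstra"]

# Loop version: build a list of the lengths of the longer words.
lengths_loop = []
for word in words:
    if len(word) > 4:
        lengths_loop.append(len(word))

# Comprehension version: same result in one expression.
lengths = [len(word) for word in words if len(word) > 4]

# enumerate() combines naturally with a dict comprehension.
positions = {word: index for index, word in enumerate(words)}

print(lengths)    # [7, 8]
print(positions)  # {'prim': 0, 'kruskal': 1, 'dijkstra': 2}
```

The comprehension is not always the better choice; when the loop body has several statements or side effects, the plain loop usually stays more readable.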

There are many other ways to write more Pythonic code. Here are some excellent videos from one of the experts on the subject:

- Transforming Code into Beautiful, Idiomatic Python
- Beyond PEP8 Best practices

And some excellent articles:

- Code like a Pythonista: Idiomatic Python
- Idiomatic Python. Coding the smart way

I understand this would be a huge undertaking, but it's just an idea that pays off greatly in the long run. We can all become better Python coders!

cclauss commented 4 years ago

> I understand this would be a huge undertaking

Incrementalism wins. Please don't try to do this across many files at once. Instead, please find a file that needs improvement and submit a pull request to improve just that one file. Please create other similar PRs for other individual files. We can progress file-by-file in this manner.

spamegg1 commented 4 years ago

@cclauss Got it.

Feliren88 commented 4 years ago

Hey, keen to contribute. I am a first-timer in open source.

ronnydw commented 3 years ago

Concerning complexity and maintainability, I've run the repo through wily (https://github.com/tonybaloney/wily), a command-line application for tracking and reporting on the complexity of Python tests and applications.

Here are the results for maintainability index [0..100] per module. The scale considers anything lower than 25 as hard to maintain, and anything over 75 as easy to maintain.

│ File                                                                    │   Maintainability Index │
│ hashes                                                                  │                 60.1865 │
│ neural_network                                                          │                 67.4381 │
│ ciphers                                                                 │                 67.8471 │
│ graphs                                                                  │                 68.9013 │
│ bit_manipulation                                                        │                 69.5657 │
│ linear_algebra                                                          │                 69.6743 │
│ searches                                                                │                 70.5201 │
│ backtracking                                                            │                 71.0636 │
│ divide_and_conquer                                                      │                 71.504  │
│ blockchain                                                              │                 71.9272 │
│ data_structures                                                         │                 71.9902 │
│ matrix                                                                  │                 72.5374 │
│ scripts                                                                 │                 73.9013 │
│ strings                                                                 │                 74.1231 │
│ machine_learning                                                        │                 75.3904 │
│ sorts                                                                   │                 75.6606 │
│ maths                                                                   │                 75.6964 │
│ other                                                                   │                 75.6972 │
│ traversals                                                              │                 76.3276 │
│ geodesy                                                                 │                 76.5739 │
│ dynamic_programming                                                     │                 76.6071 │
│ conversions                                                             │                 77.2626 │
│ boolean_algebra                                                         │                 78.4325 │
│ scheduling                                                              │                 78.8933 │
│ greedy_method                                                           │                 79.298  │
│ compression                                                             │                 80.0726 │
│ networking_flow                                                         │                 81.1551 │
│ arithmetic_analysis                                                     │                 83.0291 │
│ genetic_algorithm                                                       │                 83.2064 │
│ project_euler                                                           │                 83.7122 │
│ web_programming                                                         │                 84.3123 │
│ graphics                                                                │                 84.6733 │
│ computer_vision                                                         │                 84.8111 │
│ digital_image_processing                                                │                 86.2331 │
│ cellular_automata                                                       │                 87.3411 │
│ fuzzy_logic                                                             │                 87.7142 │
│ file_transfer                                                           │                 96.1178 │
│ quantum                                                                 │                100      │
│ images                                                                  │                100      │
│ Total                                                                   │                 77.2957 │

This allows us to track the overall maintainability of the repo (=77.3) and focus improvements on the less maintainable files. For new files, a threshold of 75 could be set for the maintainability index.
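As a sketch of how the proposed threshold could be applied, the snippet below filters a handful of the values from the table above. The dictionary is a hand-copied subset, and `below_threshold` is a hypothetical helper, not part of wily.

```python
# Subset of the maintainability index values reported above.
maintainability = {
    "hashes": 60.1865,
    "neural_network": 67.4381,
    "ciphers": 67.8471,
    "graphs": 68.9013,
    "machine_learning": 75.3904,
    "file_transfer": 96.1178,
}

THRESHOLD = 75.0  # the cut-off suggested for new files


def below_threshold(scores: dict, threshold: float = THRESHOLD) -> list:
    """Return the names scoring under the threshold, least maintainable first."""
    names = [name for name, score in scores.items() if score < threshold]
    return sorted(names, key=scores.get)


print(below_threshold(maintainability))
# ['hashes', 'neural_network', 'ciphers', 'graphs']
```

The directories at the front of that list would be the natural places to start.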

cclauss commented 3 years ago

Please pick one directory (and ONLY one directory) [like hashes] and create a PR that demonstrates how we can improve.
