jceresearch pydit issues

jceresearch / pydit

Library of data wrangling functions that an internal auditor typically needs (for my own use and learning; if you wish to use or collaborate, please get in touch, or use at your own peril).

https://pypi.org/project/pydit-jceresearch/

MIT License

2 stars 0 forks

issues

Refactor

#61 jceresearch closed 1 week ago
0
Upgrade to Python 3.13

#60 jceresearch opened 2 weeks ago
0
47 check blanks should return summaries by default

#59 jceresearch closed 5 months ago
0
Check sequence has an error with date (object) columns

#58 jceresearch closed 10 months ago
1
Fuzzy merge using one or more columns in tandem plus a hardcode

#57 jceresearch opened 1 year ago
2
Add silent mode for various functions like cleanup column

#56 jceresearch closed 1 year ago
0
Return the original DataFrame when there are no duplicates

#55 jceresearch closed 1 year ago
1
Duplicates.py needs to add a log entry about non-NaN duplicates

#54 jceresearch closed 1 year ago
0
Warning when sorting duplicates in duplicates.py

#53 jceresearch closed 1 year ago
0
Duplicates

#52 jceresearch closed 1 year ago
0
Keyword search should also generate a log entry with the count of all items found, not just specific columns

#51 jceresearch opened 2 years ago
0
Business hours calculator: consider making the default the end of the current year, or +365, so we can do future/estimated calculations

#50 jceresearch opened 2 years ago
0
Improve documentation in the business hours

#49 jceresearch opened 2 years ago
0
Logging info/debug should go to stdout and not stderr to avoid red colour in Jupyter

#48 jceresearch closed 2 years ago
1
Check blanks should return summaries by default

#47 jceresearch closed 5 months ago
0
groupby_text_concatenate: the key returned is text even if we supplied a numeric one; it needs to preserve the original

#46 jceresearch closed 2 years ago
1
groupby_text_concatenate - needs an option to return unique values

#45 jceresearch closed 2 years ago
1
check_duplicates() expand the info returned in the logging

#44 jceresearch closed 2 years ago
1
check_duplicates() documentation doesn't include the "also return non duplicates"

#43 jceresearch closed 2 years ago
1
check_duplicates(): when also returning non-duplicates with indicator=True, it should show which ones had a duplicate that was dropped

#42 jceresearch closed 5 months ago
1
check_duplicates(): the indicator column is sometimes duplicates and sometimes duplicates_keep

#41 jceresearch closed 2 years ago
1
Add feature to add frequency counts based on another column or columns

#40 jceresearch closed 2 years ago
1
Add merge forcing suffixes (as per SO solution)

#39 jceresearch closed 2 years ago
1
Also save to pickle; rename the parameter to save_to_pickle so it does not start with bool_

#38 jceresearch closed 2 years ago
0
Save xlsx needs to log what it is doing before initiating the saving as it can take ages to save

#37 jceresearch closed 2 years ago
1
Fillna smart needs more logging to show what it did

#36 jceresearch closed 2 years ago
1
Start logging - should also output an initial log entry

#35 jceresearch closed 2 years ago
1
check_blanks(): refactor for better performance of the summary

#34 jceresearch closed 2 years ago
0
coalesce_columns(): check exactly how the overwrite works when operation is None and document it in the help

#33 jceresearch closed 2 years ago
1
check_sequence() refactor to do better input validation and error handling and simpler flow control

#32 jceresearch closed 2 years ago
0
add_percentile(): research why we use a different formula here when grouping vs the full population below

#31 jceresearch opened 2 years ago
0
Check_duplicates - Pending test for keep=False or keep="last" #Test of keeping last occurrence only

#30 jceresearch closed 2 years ago
1
Loading a dataframe from a pickle does not print the shape which could be useful

#29 jceresearch closed 2 years ago
1
coalesce_columns(): find an elegant way of stripping the trailing space when using concatenate option

#28 jceresearch closed 2 years ago
1
Add ability to provide the specific columns to fill in fillna_smart()

#27 jceresearch closed 2 years ago
0
check_duplicates() TBC if we want to strip the blanks as an option, before looking for dupes.

#26 jceresearch opened 2 years ago
0
add_counts: Add counts on itself, as a quick check of duplicates, TBC if useful

#25 jceresearch closed 2 years ago
0
coalesce_columns() add options for summing values, concatenate strings, or maybe max or other operation.

#24 jceresearch opened 2 years ago
0
Add_counts_in_each_row: add more checks for when not providing a DataFrame or no records

#23 jceresearch opened 2 years ago
0
Add_counts_in_each_row: add option for not overwriting the column but creating a new one

#22 jceresearch opened 2 years ago
0
Research whether returning None in singleton __init__ is the right approach, got some errors in ipython

#21 jceresearch closed 2 years ago
1
Implement some truncation in the clean_columns_name function if the field is too long, TBC how long

#20 jceresearch closed 2 years ago
0
Make the save to Excel work with a public method.

#19 jceresearch closed 2 years ago
0
The blanks total column is not working

#18 jceresearch closed 2 years ago
0
Add support for Series in the duplicate check

#17 jceresearch closed 2 years ago
1
Develop tests and check the fullrng.issubset(unique) approach is correct

#16 jceresearch closed 2 years ago
0
Look into pretty outputs for logging lists/tuples

#15 jceresearch closed 2 years ago
1
Check that the folders exist, and see to it that this check is done every time it is updated

#14 jceresearch closed 2 years ago
1
Add to the suite Benford and similar

#13 jceresearch closed 2 years ago
1
Add to the sequence check a consideration for non working days

#12 jceresearch closed 2 years ago
1