pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

TST: split out sparse tests #18969

Closed jreback closed 5 years ago

jreback commented 6 years ago

https://github.com/pandas-dev/pandas/pull/18968

Sets up a really basic structure. Ideally, split out tests from test_series and test_frame, in a similar manner to how we do pandas/tests/frame and pandas/tests/series.

Archer6621 commented 5 years ago

@jreback Is this still relevant? If so, I'd like to help with it!

jreback commented 5 years ago

Sure, splitting up big test modules is always welcome.

Archer6621 commented 5 years ago

@jreback I've made an attempt at categorizing the tests for the sparse series only so far:

Tests in sparse.test_series.py

Test                                                                Category

TestSparseSeries
    test_constructor_dict_input                                     constructors
    test_constructor_dict_order                                     constructors
    test_constructor_dtype                                          constructors
    test_iteration_and_str                                          sparse
    test_construct_DataFrame_with_sp_series                         constructors
    test_constructor_preserve_attr                                  constructors
    test_series_density                                             sparse
    test_sparse_to_dense                                            sparse
    test_to_dense_fill_value                                        missing
    test_dense_to_sparse                                            sparse
    test_to_dense_preserve_name                                     api
    test_constructor                                                constructors
    test_constructor_scalar                                         constructors
    test_constructor_ndarray                                        constructors
    test_constructor_nonnan                                         constructors
    test_constructor_empty                                          constructors
    test_copy_astype                                                dtypes
    test_shape                                                      api
    test_astype                                                     dtypes
    test_astype_all                                                 dtypes
    test_kind                                                       sparse
    test_to_frame                                                   io
    test_pickle                                                     io
    test_getitem                                                    indexing
    test_get_get_value                                              indexing
    test_set_value                                                  indexing
    test_getitem_slice                                              indexing
    test_take                                                       indexing
    test_numpy_take                                                 indexing
    test_setitem                                                    indexing
    test_setslice                                                   indexing
    test_operators                                                  operators
    test_binary_operators                                           operators
    test_unary_operators                                            operators
    test_abs                                                        operators
    test_reindex                                                    indexing
    test_sparse_reindex                                             indexing
    test_repr                                                       repr
    test_iter                                                       api
    test_truncate                                                   api
    test_fillna                                                     missing
    test_reductions                                                 apply
    test_dropna                                                     missing
    test_homogenize                                                 sparse
    test_fill_value_corner                                          missing
    test_fill_value_when_combine_const                              missing
    test_shift                                                      api
    test_shift_nan                                                  missing
    test_shift_dtype                                                dtypes
    test_shift_dtype_fill_value                                     missing
    test_combine_first                                              api
    test_memory_usage_deep                                          sparse

TestSparseHandlingMultiIndexes
    test_to_sparse_preserve_multiindex_names_columns                indexing
    test_round_trip_preserve_multiindex_names                       indexing

TestSparseSeriesScipyInteraction
    test_to_coo_text_names_integer_row_levels_nosort                coo
    test_to_coo_text_names_integer_row_levels_sort                  coo
    test_to_coo_text_names_text_row_levels_nosort_col_level_single  coo
    test_to_coo_integer_names_integer_row_levels_nosort             coo
    test_to_coo_text_names_text_row_levels_nosort                   coo
    test_to_coo_bad_partition_nonnull_intersection                  coo
    test_to_coo_bad_partition_small_union                           coo
    test_to_coo_nlevels_less_than_two                               coo
    test_to_coo_bad_ilevel                                          coo
    test_to_coo_duplicate_index_entries                             coo
    test_from_coo_dense_index                                       coo
    test_from_coo_nodense_index                                     coo
    test_from_coo_long_repr                                         coo
    test_concat                                                     concat
    test_concat_axis1                                               concat
    test_concat_different_fill                                      concat
    test_concat_axis1_different_fill                                concat
    test_concat_different_kind                                      concat
    test_concat_sparse_dense                                        concat
    test_value_counts                                               analytics
    test_value_counts_dup                                           analytics
    test_value_counts_int                                           analytics
    test_isna                                                       missing
    test_notna                                                      missing

TestSparseSeriesAnalytics
    test_cumsum                                                     analytics
    test_numpy_cumsum                                               analytics
    test_numpy_func_call                                            analytics
    test_deprecated_numpy_func_call                                 analytics
    test_deprecated_reindex_axis                                    analytics

With this I've introduced two new categories as I felt these were necessary, given that some of the tests were very specific. These two are sparse and coo, where sparse is for new properties/features that the sparse series introduce and coo is for SciPy's coordinate format matrices (there are a bunch of tests that are specifically related to this).
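
For readers unfamiliar with the coordinate (COO) layout that the coo category refers to, here is a minimal pure-Python sketch of the idea: a sparse matrix stored as parallel lists of row indices, column indices, and values. The helper name `coo_to_dense` and its parameters are hypothetical and only illustrate the format; this is not SciPy's or pandas' API. In this sketch a duplicate coordinate overwrites the earlier value.

```python
def coo_to_dense(rows, cols, data, shape, fill_value=0.0):
    """Expand (row, col, value) triplets into a dense list-of-lists matrix.

    Hypothetical helper for illustration only; a later duplicate
    coordinate simply overwrites an earlier one here.
    """
    n_rows, n_cols = shape
    # Start from a matrix filled with the "missing" value.
    dense = [[fill_value] * n_cols for _ in range(n_rows)]
    # Place each stored value at its (row, col) coordinate.
    for r, c, v in zip(rows, cols, data):
        dense[r][c] = v
    return dense


# Three stored entries of a 3x3 matrix; every other cell is the fill value.
rows = [0, 1, 2]
cols = [0, 2, 1]
data = [3.0, 1.0, 2.0]
dense = coo_to_dense(rows, cols, data, shape=(3, 3))
```

Only the three stored entries are kept in memory in the COO form, which is why the to_coo/from_coo tests care about how index levels map to row and column coordinates.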

Let me know what you think of this categorization. If it looks good, I'll start splitting the tests and put up a PR.
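
As a rough sketch of how the categorization above could drive the split, the snippet below groups a small excerpt of the (test, category) pairs into one target module per category. The target package path `pandas/tests/sparse/series` and the `test_<category>.py` naming are assumptions for illustration, not something decided in this thread.

```python
from collections import defaultdict

# A small excerpt of the (test name, category) pairs from the listing above.
CATEGORIZED = [
    ("test_constructor_dict_input", "constructors"),
    ("test_series_density", "sparse"),
    ("test_to_dense_fill_value", "missing"),
    ("test_getitem", "indexing"),
    ("test_to_coo_bad_ilevel", "coo"),
    ("test_concat_axis1", "concat"),
    ("test_cumsum", "analytics"),
    ("test_constructor_scalar", "constructors"),
]


def plan_split(pairs, package="pandas/tests/sparse/series"):
    """Map each category to a target module path and list the tests it gets.

    The package path and file naming are hypothetical.
    """
    plan = defaultdict(list)
    for name, category in pairs:
        plan[f"{package}/test_{category}.py"].append(name)
    return dict(plan)


plan = plan_split(CATEGORIZED)
for path, tests in sorted(plan.items()):
    print(path, tests)
```

Keeping the mapping as data makes it easy to review the plan before moving any test code around.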