Open atolopko-czi opened 1 year ago
Added unit tests for various batching cases, sparse/dense output, and encoders
@pablo-gar (cc: @atolopko-czi @ebezzi ) - Is this intentionally left open to serve as a placeholder for unit test tasks for pytorch? I ask because the ticket is was created in May 2023.
If so, should we create tickets for specific test cases for pytorch
? For example, we don't have any testing for the scatter-gather-shuffle
algorithm - It is a non-deterministic algorithm and testing it needs some thought. Should we make a specific ticket for this case and link it to this ticket?
This is definitely an old ticket!
There are two stubbed/ignored tests that are intended to test coverage for https://github.com/chanzuckerberg/cellxgene-census/blob/96c0499cd83521d169220f37555724bc9c67f114/api/python/cellxgene_census/src/cellxgene_census/experimental/ml/pytorch.py#L547-L555. But honestly, we need to eliminate the multi-worker support entirely (it's redundant w/TileDB threaded I/O), so IMO those test stubs can be deleted.
Regarding the other two test suggestions, above:
The obs / X data is joined correctly (X rows correspond to their respective obs soma_joinids)
This appears to be implicitly tested by various methods such as test_batching__all_batches_full_size
, test_batching__exactly_one_batch
, and test_experiment_dataloader__batched
, where both the obs and X contents of a batch are asserted in tandem. Could still be worth adding a test that is named appropriately to explicitly capture the requirement, e.g. test_obs_and_X_are aligned
.
The var / X data joined correctly (gene ordering matches var query filter)
Similarly, this is implicitly tested by tests that assert the ordered values within a row of X, but I would add an explicit test for this. And particularly for sparse tensors, I would test that the expected data at a given var coordinate is correct.
Census PyTorch ExperimentDataPipe unit test coverage is already substantial. However, there are a few stubbed tests that could implemented.
In addition add tests for: