Open sampan501 opened 3 years ago
Basically, like in this test, but for all the tests in the independence and k-sample module
@sampan501 Can I do this one?
@sampan501 What should be done here? The description is a little vague. I checked the module but am still not sure about the required modifications.
So, Dcorr supports block permutation using the perm_blocks parameter. I want that same functionality, but for K-sample Dcorr. So basically, add perm_blocks the same way auto is handled here: https://github.com/neurodata/hyppo/blob/135cfcd6b13a986b09ec7629931bce8676ea5547/hyppo/ksample/ksamp.py#L299
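For readers unfamiliar with the idea, the sketch below illustrates a simple one-level block permutation in plain Python: samples are shuffled only within their own block, never across blocks. It is not hyppo's implementation. `block_permutation` and `labels` are hypothetical names chosen for illustration, and hyppo's handling of `perm_blocks` may differ (for example, in how nested blocks are treated).

```python
import random
from collections import defaultdict


def block_permutation(block_labels, rng):
    """Return a permutation of sample indices that only swaps within blocks.

    block_labels[i] is the block of sample i (hypothetical input format).
    """
    # Group sample indices by their block label.
    members = defaultdict(list)
    for index, label in enumerate(block_labels):
        members[label].append(index)

    permuted = list(range(len(block_labels)))
    for indices in members.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Position `old` now points at another sample from the same block.
        for old, new in zip(indices, shuffled):
            permuted[old] = new
    return permuted


labels = [0, 0, 0, 1, 1, 2, 2, 2]
perm = block_permutation(labels, random.Random(0))
print(perm)
```

Every index in `perm` maps to an index with the same label, so a permutation test built on it would only compare samples that are exchangeable under the block structure.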
@sampan501 You mean this part: https://github.com/neurodata/hyppo/blob/135cfcd6b13a986b09ec7629931bce8676ea5547/hyppo/independence/dcorr.py#L222-L258
for all the tests in the independence and k-sample module?
That and https://github.com/neurodata/hyppo/blob/135cfcd6b13a986b09ec7629931bce8676ea5547/hyppo/independence/dcorr.py#L253 this for k-sample tests https://github.com/neurodata/hyppo/blob/135cfcd6b13a986b09ec7629931bce8676ea5547/hyppo/independence/base.py#L138 and modify this unit test to test your changes https://github.com/neurodata/hyppo/blob/5e0fe5e4337d5997f2fb1b313cd8e94540aa8807/hyppo/tools/tests/test_common.py#L265
I don't have extensive knowledge of unit testing, so I need a little more instruction on these, or maybe a link to learn more about them? @sampan501
Can you assign me this issue please? And where do I call the test_permtest function?
The test_permtest function is called by pytest. I think the above links explain how the process works. Let me know if you have any questions.
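As a rough illustration of why the function is never called by hand: by default, pytest collects functions whose names start with `test` from files named `test_*.py` (or `*_test.py`) and runs them, treating a failed `assert` as a test failure. The sketch below shows the shape of such a test. `permute_within_blocks` is a hypothetical stand-in for the code under test, not a hyppo function.

```python
import random


def permute_within_blocks(labels, seed):
    """Hypothetical stand-in for the code under test."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    for label in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for old, new in zip(idx, shuffled):
            order[old] = new
    return order


def test_permutation_keeps_samples_in_their_block():
    # pytest would discover this function by its "test_" prefix.
    labels = [0, 0, 1, 1, 1]
    order = permute_within_blocks(labels, seed=1)
    # The result is a true permutation of all sample indices...
    assert sorted(order) == list(range(len(labels)))
    # ...and no sample leaves its block.
    assert all(labels[i] == labels[j] for i, j in enumerate(order))
```

Saved in a file such as `test_example.py` (a hypothetical name), this would be picked up by running pytest on the containing folder.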
Hi, I am new to open source. Is this issue still open?
I want to work on this issue.
@sampan501 Thanks for assigning me, sir. I am working on it.
hey, can I help implement this for some of the tests in the independence folder?
Yes please
thanks!
I'm new to OS and I had some issues setting up hyppo/running the tests locally. I installed the dependencies in dev-requirements.txt and tried running pytest, but it had errors collecting a bunch of files.
Am I missing some dependencies? If possible, could you point me to some resources for getting started running Hyppo locally? I tried installing the docs dependencies and building them but I got this error:
ImportError: cannot import name 'Union' from 'types' (/usr/lib/python3.10/types.py)
Hi, just wanted to bump to ask again about running Hyppo in editable mode.
I cloned the repo, installed the dependencies in requirements.txt, dev-requirements.txt, and docs/requirements.txt, ran pip install -e . , then ran pytest. I got errors finding files in benchmarks/ and an error importing 'Union' from 'types' in python3.10.
I also wasn't able to build the doc files. I installed the doc requirements.txt to my virtual environment then ran make html, getting the same error importing 'Union' from 'types' in python3.10.
Is there something wrong with my workflow? Or should I use a different python version? I tried python3 as well with the same error.
Thanks again!
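The error message means that some module executed `from types import Union`. In the standard library, `Union` is provided by `typing`; the thread does not identify which package performs the failing import. A small diagnostic sketch like the following (the function name `describe_union_support` is hypothetical) can confirm what the active interpreter actually exposes:

```python
import sys
import types
import typing


def describe_union_support():
    """Report where Union-related names exist in this interpreter."""
    return {
        "python": sys.version_info[:2],
        "typing.Union": hasattr(typing, "Union"),
        "types.Union": hasattr(types, "Union"),
        "types.UnionType": hasattr(types, "UnionType"),
    }


print(describe_union_support())
```

If `types.Union` is reported as missing, the traceback of the original ImportError should show which file contains the import, which is the place to look next.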
Hi, sorry for not getting back to you sooner, I was a little busy. Usually I run pytest on just the "hyppo" folder in the repo and not the entire package.
Really, pytest is only used when running the unit tests in the package, which are all contained within the tests folders in each module.
gotcha! Thanks for getting back to me. Should I just test the independence folder after making changes? Or is there a way for me to test the entire package? Sorry, not sure about best practices for contributing.
The tests in independence passed successfully for me 🥳, so I'll get started
I would test the methods and modules that you are changing and create a new test when you add your code. The code will also be built automatically in CircleCI when you push a commit to a PR.
Hello everyone, I'm new to open source, if this issue is still open, can you assign it to me?
Is this issue solved?
@mahimairaja Not yet, I have not had a PR open about it yet
Hi. Could you please assign me this issue? I would like to contribute to this and I'm sure I can help you out.
Have you tried this? I've been going through the links you provided and analysing them:
- check_perm_blocks_dim function.
- chi2_approx function, which computes an approximate chi-square statistic.
- compute_dist function.
- Dcorr that inherits from IndependenceTest.
- __init__ method of the Dcorr class, where you can specify the compute_distance parameter.
- statistic method of the Dcorr class, which calculates the Dcorr test statistic.
- p_value method of the Dcorr class, which computes the p-value for the test statistic.
- IndependenceTestOutput class to store the results of the independence test.
Basically, perm_blocks has been implemented for one test (Dcorr) and we want it implemented the same way for multiple tests. I believe the things you commented are already in the package.
If you have already performed those steps, then to implement the block permutation for multiple tests, you can follow a similar approach as you did for the Dcorr test. For each test, you can follow the same instructions you used for the Dcorr test as I mentioned above. Just make sure to adapt the code to the specific requirements of each test. By repeating the steps for each test, you'll be able to implement the block permutation effectively to multiple tests
To adapt the code, you might change the distance metric used in the compute_distance function or add functions specific to each test. For example, modifying the distance metric in compute_distance:
import math

# Signature assumed from the names used in the body.
def compute_distance(data_point1, data_point2, distance_metric="euclidean"):
    if distance_metric == "euclidean":
        # Calculate Euclidean distance between data_point1 and data_point2
        distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(data_point1, data_point2)))
    elif distance_metric == "manhattan":
        # Calculate Manhattan distance between data_point1 and data_point2
        distance = sum(abs(x - y) for x, y in zip(data_point1, data_point2))
    else:
        raise ValueError("Invalid distance metric")
    return distance
The stuff you are linking seems completely unrelated to this issue. Please take a look through the code I provided above in the issue and see if that makes sense to you. If it doesn't, lmk
The link I've accessed is the first one you provided, which contains all the packages you imported and Dcorr.
Basically, perm_blocks has been implemented for one test (Dcorr) and we want it implemented the same way for multiple tests. I believe the things you commented are already in the package
So from your last comment, I understood that you want perm_blocks implemented for multiple tests, right?
Am I missing anything?
The link I provided is the implementation of Dcorr within hyppo. I just want a similar implementation to that for all other independence tests within hyppo. Does that make sense?
Could you provide me with more details about the specific independence tests you're interested in? There are several other independence tests within hyppo that could benefit from a similar implementation. Some of these tests include the CCA test, HHG test, and MGC test, among others. By following a similar approach as the Dcorr implementation, we can ensure consistency across these different tests.
I just want to make sure that's exactly what you mean.
Let's start with one, maybe CCA, and then add more to the PR when that gets done
Great. I feel like it's better to break it down and then we can move on to other tests. Since you want to focus on CCA for now, what has your approach been like?
Have you tried anything like this?
import numpy as np
from hyppo.independence import CCA

# Generate your data (placeholder: random, independent samples)
# X and Y should be numpy arrays or pandas DataFrames
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
Y = rng.normal(size=(100, 2))

# Create an instance of the CCA test
cca_test = CCA()

# Perform the CCA independence test
test_statistic, p_value = cca_test.test(X, Y)

# Print the results
print("Test Statistic:", test_statistic)
print("P-value:", p_value)
I'm just importing the `CCA` class from `hyppo.independence` and creating an instance of it. Then, you can perform the CCA independence test by calling the `test` method on the test instance, passing in your data `X` and `Y`. The test will return the test statistic and p-value, which you can then use or print as needed.
Please lmk if any changes are required and what method you've tried so far