jhu-bids / TermHub

Web app and CLI tools for working with biomedical terminologies. https://github.com/orgs/jhu-bids/projects/9/views/7
https://bit.ly/termhub
GNU General Public License v3.0

Tests: Have LLMs create many for us #819

Open joeflack4 opened 3 months ago

joeflack4 commented 3 months ago

Overview

Apparently GPT-4 (IDK about 3.5, 4o, or other LLMs) is really good at creating unit tests. Claude 3.5 just came out and is supposedly the current top-ranking LLM for most things, including code, though there tends to be a new champion every month.

We can't just copy/paste the whole codebase into the prompt yet, but we could do single files at a time. I think this could be a great way to quickly (i) stress test the codebase (it can find good edge cases) and (ii) guard against regressions when the code changes.

Siggie suggests that uploading .py files will yield better results than simply copy/pasting them into the prompt.
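
To work through the codebase one file at a time, a small helper along these lines could list the candidate modules with their line counts. This is only a sketch: the function name `list_candidate_files` and the skipped directory names are assumptions, not part of TermHub.

```python
from pathlib import Path


def list_candidate_files(root, skip_dirs=("test", "tests", "venv", "node_modules")):
    """Return (path, line_count) pairs for each .py file under root, largest first.

    `skip_dirs` is a hypothetical default; adjust it to the repo layout.
    """
    root = Path(root)
    candidates = []
    for path in root.rglob("*.py"):
        # Skip test code and vendored/virtualenv directories.
        if any(part in skip_dirs for part in path.relative_to(root).parts):
            continue
        with path.open(encoding="utf-8", errors="replace") as f:
            n_lines = sum(1 for _ in f)
        candidates.append((path, n_lines))
    # Largest files first, so oversized uploads are easy to spot.
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

Calling it on a package directory returns the files to upload one by one, with the biggest (and likely hardest to fit in a prompt) at the top.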

Candidate files to test

I'm not really sure yet. Some files are harder to test: e.g. the 'routes' ones would require the server and DB to be up, or else they have to be mocked, like the mock tests below for the objects API.
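
As a rough illustration of the mocking option, a function that takes a DB connection can be tested without a running server or database by passing in a `MagicMock`. Everything here is hypothetical: `get_concept_name`, the query, and the table name are made up for the sketch and are not TermHub code.

```python
import unittest
from unittest.mock import MagicMock


def get_concept_name(con, concept_id):
    """Hypothetical route helper: look up one concept name via a DB connection."""
    row = con.execute(
        "SELECT concept_name FROM concept WHERE concept_id = :id",
        {"id": concept_id},
    ).fetchone()
    return row[0] if row else None


class TestGetConceptName(unittest.TestCase):
    def setUp(self):
        # MagicMock stands in for a live connection; no server or DB is needed.
        self.mock_con = MagicMock()

    def test_found(self):
        # Configure the chained call con.execute(...).fetchone().
        self.mock_con.execute.return_value.fetchone.return_value = ("Test1",)
        self.assertEqual(get_concept_name(self.mock_con, 1), "Test1")
        self.mock_con.execute.assert_called_once()

    def test_missing(self):
        self.mock_con.execute.return_value.fetchone.return_value = None
        self.assertIsNone(get_concept_name(self.mock_con, -1))
```

Saved as a test module, this runs with `python -m unittest`. The trade-off is that the mock only checks how the code talks to the connection, not whether the SQL itself is valid.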

Progress so far

1. enclave_wrangler.objects_api

https://claude.ai/chat/7f64ea42-7ab2-43c6-9a86-382bdafc26bf

Prompt: "I am uploading a Python file. Can you write as many tests as possible for it?"

I implemented some of the tests from the response in TestObjectsApiMocks, but I haven't implemented the following yet.

These mock tests require time to set up. They don't just work out of the box. See TestObjectsApiMocks for reference.


```py
# Methods intended for the TestObjectsApiMocks class. Assumes `from unittest.mock import patch`
# and that the functions under test are imported from enclave_wrangler.objects_api.

@patch('enclave_wrangler.objects_api.make_objects_request')
def test_download_all_researchers(self, mock_make_objects_request):
    mock_make_objects_request.return_value = [{'name': 'Researcher1'}, {'name': 'Researcher2'}]
    result = download_all_researchers()
    self.assertEqual(result, [{'name': 'Researcher1'}, {'name': 'Researcher2'}])

@patch('enclave_wrangler.objects_api.make_objects_request')
def test_get_researcher(self, mock_make_objects_request):
    mock_make_objects_request.return_value = {'name': 'TestResearcher'}
    result = get_researcher('123')
    self.assertEqual(result, {'name': 'TestResearcher'})

@patch('enclave_wrangler.objects_api.make_objects_request')
def test_get_projects(self, mock_make_objects_request):
    mock_make_objects_request.return_value = [{'name': 'Project1'}, {'name': 'Project2'}]
    result = get_projects()
    self.assertEqual(result, [{'name': 'Project1'}, {'name': 'Project2'}])

@patch('enclave_wrangler.objects_api.fetch_object_by_id')
def test_fetch_cset_version(self, mock_fetch_object_by_id):
    mock_fetch_object_by_id.return_value = {'id': 1, 'name': 'TestCsetVersion'}
    result = fetch_cset_version(1)
    self.assertEqual(result, {'id': 1, 'name': 'TestCsetVersion'})

@patch('enclave_wrangler.objects_api.make_objects_request')
def test_fetch_cset_container(self, mock_make_objects_request):
    mock_make_objects_request.return_value = {'id': 'abc', 'name': 'TestContainer'}
    result = fetch_cset_container('abc')
    self.assertEqual(result, {'id': 'abc', 'name': 'TestContainer'})

@patch('enclave_wrangler.objects_api.fetch_object_by_id')
def test_fetch_cset_member_item(self, mock_fetch_object_by_id):
    mock_fetch_object_by_id.return_value = {'id': 1, 'name': 'TestMemberItem'}
    result = fetch_cset_member_item(1)
    self.assertEqual(result, {'id': 1, 'name': 'TestMemberItem'})

@patch('enclave_wrangler.objects_api.fetch_cset_member_item')
def test_fetch_concept(self, mock_fetch_cset_member_item):
    mock_fetch_cset_member_item.return_value = {'id': 1, 'name': 'TestConcept'}
    result = fetch_concept(1)
    self.assertEqual(result, {'id': 1, 'name': 'TestConcept'})

@patch('enclave_wrangler.objects_api.fetch_object_by_id')
def test_fetch_cset_expression_item(self, mock_fetch_object_by_id):
    mock_fetch_object_by_id.return_value = {'id': 1, 'name': 'TestExpressionItem'}
    result = fetch_cset_expression_item(1)
    self.assertEqual(result, {'id': 1, 'name': 'TestExpressionItem'})

@patch('enclave_wrangler.objects_api.sql_query_single_col')
@patch('enclave_wrangler.objects_api.fetch_all_csets')
def test_find_missing_csets_within_threshold(self, mock_fetch_all_csets, mock_sql_query_single_col):
    mock_sql_query_single_col.return_value = [1, 2]
    mock_fetch_all_csets.return_value = [
        {'codesetId': 1, 'createdAt': '2023-01-01T00:00:00Z'},
        {'codesetId': 2, 'createdAt': '2023-07-01T00:00:00Z'},
        {'codesetId': 3, 'createdAt': '2023-07-01T00:00:00Z'},
    ]
    result = find_missing_csets_within_threshold(30, self.mock_connection)
    self.assertEqual(len(result), 1)
    self.assertIn(3, result)

@patch('enclave_wrangler.objects_api.make_objects_request')
@patch('enclave_wrangler.objects_api.get_concept_set_version_expression_items')
@patch('enclave_wrangler.objects_api.items_to_atlas_json_format')
def test_get_codeset_json(self, mock_items_to_atlas_json, mock_get_items, mock_make_objects_request):
    mock_make_objects_request.side_effect = [
        {'conceptSetNameOMOP': 'TestSet'},
        {'name': 'TestContainer'},
    ]
    mock_get_items.return_value = [{'id': 1}, {'id': 2}]
    mock_items_to_atlas_json.return_value = [{'concept': {'CONCEPT_ID': 1}}, {'concept': {'CONCEPT_ID': 2}}]
    result = get_codeset_json(1, self.mock_connection, use_cache=False)
    self.assertIn('concept_set_container', result)
    self.assertIn('version', result)
    self.assertIn('items', result)

@patch('enclave_wrangler.objects_api.get_object_links')
def test_get_concept_set_version_expression_items(self, mock_get_object_links):
    mock_get_object_links.return_value = [{'properties': {'itemId': 1}}, {'properties': {'itemId': 2}}]
    result = get_concept_set_version_expression_items(1, 'full')
    self.assertEqual(len(result), 2)
    self.assertEqual(result[0]['properties']['itemId'], 1)

@patch('enclave_wrangler.objects_api.get_object_links')
def test_get_concept_set_version_members(self, mock_get_object_links):
    mock_get_object_links.return_value = [{'properties': {'conceptId': 1}}, {'properties': {'conceptId': 2}}]
    result = get_concept_set_version_members(1, 'full')
    self.assertEqual(len(result), 2)
    self.assertEqual(result[0]['properties']['conceptId'], 1)

# Note: this one doesn't have a 'patch' decorator but is a mock test
def test_items_to_atlas_json_format(self):
    items = [
        {'conceptId': 1, 'includeDescendants': True, 'includeMapped': False, 'isExcluded': False},
        {'conceptId': 2, 'includeDescendants': False, 'includeMapped': True, 'isExcluded': True},
    ]
    with patch('enclave_wrangler.objects_api.get_concepts') as mock_get_concepts:
        mock_get_concepts.return_value = [
            {'concept_id': 1, 'concept_name': 'Test1', 'domain_id': 'Test', 'concept_class_id': 'Test'},
            {'concept_id': 2, 'concept_name': 'Test2', 'domain_id': 'Test', 'concept_class_id': 'Test'},
        ]
        result = items_to_atlas_json_format(items)
    self.assertEqual(len(result), 2)
    self.assertEqual(result[0]['includeDescendants'], True)
    self.assertEqual(result[1]['isExcluded'], True)
```
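
One thing that is easy to get wrong when implementing the tests above with several stacked `@patch` decorators: the mocks are injected bottom-up, so the decorator closest to the function supplies the first mock argument. A self-contained sketch of that ordering follows; the `Api` class and `summarize` function are hypothetical stand-ins, not TermHub code.

```python
import unittest
from unittest.mock import patch


class Api:
    """Hypothetical stand-in for a module whose functions hit the network."""

    @staticmethod
    def fetch_version(codeset_id):
        raise RuntimeError("would call the real API")

    @staticmethod
    def fetch_container(name):
        raise RuntimeError("would call the real API")


def summarize(codeset_id):
    """Hypothetical function under test: combines two API calls."""
    version = Api.fetch_version(codeset_id)
    container = Api.fetch_container(version["container"])
    return {"version": version, "container": container}


class TestStackedPatches(unittest.TestCase):
    @patch.object(Api, "fetch_version")    # outermost decorator: injected last
    @patch.object(Api, "fetch_container")  # closest to the function: injected first
    def test_summarize(self, mock_fetch_container, mock_fetch_version):
        mock_fetch_version.return_value = {"id": 1, "container": "abc"}
        mock_fetch_container.return_value = {"name": "abc"}

        result = summarize(1)

        # Each mock saw the call meant for it, confirming the argument order.
        mock_fetch_version.assert_called_once_with(1)
        mock_fetch_container.assert_called_once_with("abc")
        self.assertEqual(result["container"], {"name": "abc"})
```

If the two mock parameters were swapped, the `assert_called_once_with` checks would fail, which makes this kind of assertion a cheap guard when adapting LLM-generated tests.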