asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/
Apache License 2.0
10 stars 5 forks source link

xray processor #67

Closed bhaskar2443053 closed 2 years ago

bhaskar2443053 commented 2 years ago

This PR fixes #60 and also fixes #68

Description of changes

This a new processor for x-ray image classification of PNEUMONIA disease. It uses the HuggingFace "nickmuchi/vit-finetuned-chest-xray-pneumonia" model to make predictions.

Possible influences of this PR.

As a side effect the ICD processor test is failed. this PR also fixes it.

Test Conducted

Describe what test cases are included for the PR.

codecov[bot] commented 2 years ago

Codecov Report

Merging #67 (49c1555) into master (71b8d78) will increase coverage by 1.34%. The diff coverage is 94.80%.

@@            Coverage Diff             @@
##           master      #67      +/-   ##
==========================================
+ Coverage   86.54%   87.89%   +1.34%     
==========================================
  Files          10       14       +4     
  Lines         617      735     +118     
==========================================
+ Hits          534      646     +112     
- Misses         83       89       +6     
Impacted Files Coverage Δ
...tex/health/processors/icd_coding_processor_test.py 95.65% <66.66%> (-4.35%) :arrow_down:
fortex/health/processors/xray_image_processor.py 94.44% <94.44%> (ø)
...ts/fortex/health/processors/xray_processor_test.py 96.42% <96.42%> (ø)
fortex/health/processors/icd_coding_processor.py 93.61% <100.00%> (+0.59%) :arrow_up:
...ts/fortex/health/readers/xray_image_reader_test.py 95.45% <0.00%> (ø)
fortex/health/readers/xray_image_reader.py 96.29% <0.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

Piyush13y commented 2 years ago

@bhaskar2443053 Looks good in the first glance. However, you should add a unit test for this processor in the same PR, so we can ensure the correctness of it as well.

bhaskar2443053 commented 2 years ago

@bhaskar2443053 Looks good in the first glance. However, you should add a unit test for this processor in the same PR, so we can ensure the correctness of it as well.

added the test case. But facing some dependency installation issues. Could you look into it?

Piyush13y commented 2 years ago

@bhaskar2443053 Looks good in the first glance. However, you should add a unit test for this processor in the same PR, so we can ensure the correctness of it as well.

added the test case. But facing some dependency installation issues. Could you look into it?

Seems like a connection issue with Azure. Lets wait for a day, and see if its resolved on its own. Moreover, I don't see the examples/xray/sample_data/folder created in the master repo. Could you help me with what data the tests are running on?

Update: The CI pipeline is going through now, Azure must have down for sometime causing the previous issue.

bhaskar2443053 commented 2 years ago

@bhaskar2443053 Looks good in the first glance. However, you should add a unit test for this processor in the same PR, so we can ensure the correctness of it as well.

added the test case. But facing some dependency installation issues. Could you look into it?

Seems like a connection issue with Azure. Lets wait for a day, and see if its resolved on its own. Moreover, I don't see the examples/xray/sample_data/folder created in the master repo. Could you help me with what data the tests are running on?

added the sample data folder to this PR. It contains two x-ray image files with the class name as the filename.

bhaskar2443053 commented 2 years ago

@Piyush13y could you review again !

Piyush13y commented 2 years ago

I don't think the test cases are running at all, neither the one for processor nor the one that has been merged for X-ray reader. You should add the test files to our workflow to include them in the pipeline, here. You can check what tests are running in the pipeline , lets ensure Xray test cases are actually being executed.

bhaskar2443053 commented 2 years ago

I don't think the test cases are running at all, neither the one for processor nor the one that has been merged for X-ray reader. You should add the test files to our workflow to include them in the pipeline, here. You can check what tests are running in the pipeline , lets ensure Xray test cases are actually being executed.

@Piyush13y I have add the test to workflow. and set the forte version to 0.3.0.dev2 . but it still shows the error AttributeError: 'DataPack' object has no attribute 'add_image'

image

any solution to deal with this issue ?

Piyush13y commented 2 years ago

I don't think the test cases are running at all, neither the one for processor nor the one that has been merged for X-ray reader. You should add the test files to our workflow to include them in the pipeline, here. You can check what tests are running in the pipeline , lets ensure Xray test cases are actually being executed.

@Piyush13y I have add the test to workflow. and set the forte version to 0.3.0.dev2 . but it still shows the error AttributeError: 'DataPack' object has no attribute 'add_image'

image

any solution to deal with this issue ?

That's because even forte 0.3.0.dev2 doesn't have these updated changes for image handling in data pack. We might have to add a new forte dev release, but feasibility of that can depend on multiple factors. Let me check with Hector and get back to you.

bhaskar2443053 commented 2 years ago

@Piyush13y with forte version 0.3.0.dev3, the xray reader test is passed

image

but the ICD coding test is failing.

image

dev3 version is behaving like the installation from master.

Piyush13y commented 2 years ago

@Piyush13y with forte version 0.3.0.dev3, the xray reader test is passed image

but the ICD coding test is failing. image

dev3 version is behaving like the installation from master.

@J007X, can you please look into this since it's throwing some error in the ICD Coding Processor?

J007X commented 2 years ago

@Piyush13y the ICD Coding Processor was not changed recently and the error can not be reproduced locally. Looking from the log of running python 3.7 it seemed download not finished and somehow it times out?

bhaskar2443053 commented 2 years ago

@Piyush13y the ICD Coding Processor was not changed recently and the error can not be reproduced locally. Looking from the log of running python 3.7 it seemed download not finished and somehow it times out?

@J007X @Piyush13y the model loading from the transformer pipeline was not working correctly. modified that by loading model and tokenizer separately. It works now.

bhaskar2443053 commented 2 years ago

@Piyush13y could you review again. All CL jobs passed now.

Piyush13y commented 2 years ago

@hunterhector The ICD Coding processor was not working with the latest forte pre release 0.3.0.dev3, Bhaskar had to update the processor to load Tokenizer and model separately for it to work. Not sure if that is due to a bug in the release or just our config, wanted to keep you updated. Or maybe we have just updated the version of transformers, which caused the issue.