danforthcenter / plantcv

Plant phenotyping with image analysis
Mozilla Public License 2.0

Data file types for Hyperspectral Workflow. #639

Closed bmsowder closed 3 years ago

bmsowder commented 4 years ago

Hello!

When going through the Hyperspectral Workflow I noticed that the code seems to support only ENVI file types.

Currently, the trouble I am having is that our hyperspectral camera and software give us .bsq and .hdr files. It looks as though converting these to ENVI files requires third-party software.

Have other users faced this problem? If so, what has worked for them? Is there any work being done on accepting the common .bsq files in the PlantCV Hyperspectral Workflow? How does the PlantCV team work past this hurdle?

If third-party software is needed to convert .bsq/.hdr into ENVI files, what do you recommend?

Thank you for your time!

maliagehan commented 4 years ago

Hi @bmsowder, we currently support the ENVI/hdr file format because that is the file type we and other current users use. But we are interested in expanding the file types accepted by PlantCV. If we could get some sample data to work with and test, that would be great.

DannieSheng commented 4 years ago

Hi @bmsowder! "envi" in the plantcv function "readimage" indicates the mode (or type) of the image, as distinct from other image types such as "rgb", "gray", etc. You can refer to this page for the usage of this function: https://plantcv.readthedocs.io/en/stable/read_image/

For your hyperspectral data, if there is one file called "filename1.hdr" and a corresponding file called "filename1", you should be able to use this function. The file without a file extension ("filename1") is the raw data file, and the file ending in ".hdr" is the corresponding header file.

Feel free to correct me if I am wrong.

I personally haven't tested on ".bsq" files. You can try the function on your .hdr file (make sure you have the corresponding raw data file). If that doesn't work, feel free to let me know and share your data if possible.
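
To check which data file actually sits next to a given header, something like the following could be used. This is a minimal sketch, not PlantCV behavior: `find_data_file` is a hypothetical helper, and the list of extensions it tries is an illustrative guess rather than anything PlantCV is known to search for.

```python
import tempfile
from pathlib import Path


def find_data_file(hdr_path, extensions=("", ".bsq", ".raw", ".dat")):
    """Return the first existing data file that shares the header's base name.

    Hypothetical helper: the extension list is an assumption for illustration.
    """
    hdr = Path(hdr_path)
    base = hdr.with_suffix("")  # "filename1.hdr" -> "filename1"
    for ext in extensions:
        candidate = base.parent / (base.name + ext)
        if candidate.is_file():
            return candidate
    return None


# Demo with a throwaway directory that mimics the layout described in this thread
with tempfile.TemporaryDirectory() as tmp:
    hdr = Path(tmp) / "filename1.hdr"
    hdr.write_text("ENVI\n")
    (Path(tmp) / "filename1.bsq").write_bytes(b"\x00")
    found = find_data_file(hdr)
    found_name = found.name if found else None

print(found_name)  # filename1.bsq
```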

bmsowder commented 4 years ago

When looking at my hyperspectral data files, I do not see a file without a file extension ("filename1") along with my header file ("filename1.hdr").

What I do see when I open the hyperspectral cube folder is the header file ("filename1.hdr") and the corresponding hyperspectral data cube ("filename1.bsq").

I tried to attach a sample folder of our data, except the file is too large to send over this comment on github.

nfahlgren commented 4 years ago

Hi @bmsowder, could you send us an email at plantcv@danforthcenter.org so we can set up a file sharing space to upload an example hyperspectral image to? Thanks!

oetodd commented 3 years ago

@nfahlgren I am now helping with this project, and I was wondering now (running version 3.13), if we can revisit this issue. I am using the zipped notebook: Hyperspectral workflow.ipynb.zip and when running the .bsq data, am receiving the following error: "ValueError: cannot reshape array of size 9585429 into shape (48,331,591)"

As @bmsowder mentioned, there appears to be no filename without an extension. Based on other tickets and my own testing, it seems that the .hdr is required for the input image. This may be an error I need to address before starting the workflow. Attached below are the three files (presumably created by the processing software): Cube.zip.

Do you have any thoughts on how to proceed?

nfahlgren commented 3 years ago

Hi @oetodd, I need to test again with the data you attached (thanks!) but I recall there were two issues we ran into before. One was that we did not support BSQ data, but now do. The other was that there is a discrepancy between the .hdr file and the .bsq file. According to the .hdr file, the size of the data file should be:

lines (height): 331
samples (width): 591
bands (depth): 48

331 * 591 * 48 = 9389808

But the size of the file is 9585429, and if we divide this by the height * width:

9585429 / (331*591) = 49

This means there is one extra band/frame in the data cube relative to what the metadata file says there should be.
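
The arithmetic above can be reproduced with a short Python check. This is a minimal sketch using only the numbers quoted in this thread. `implied_band_count` is a hypothetical helper name, not a PlantCV function, and 9585429 is treated as a count of stored values (as in the reshape error), not necessarily a size in bytes.

```python
def implied_band_count(n_values, lines, samples):
    """Return (bands, leftover_values) implied by the number of stored values."""
    values_per_band = lines * samples
    return divmod(n_values, values_per_band)


# Numbers quoted in this thread
lines, samples, header_bands = 331, 591, 48
n_values = 9585429  # array size reported in the reshape error

expected = lines * samples * header_bands
bands, leftover = implied_band_count(n_values, lines, samples)
extra_bands = bands - header_bands

print(expected)         # 9389808
print(bands, leftover)  # 49 0
print(extra_bands)      # 1
```

A leftover of 0 means the data divides evenly into whole frames, so the mismatch is exactly one additional band, not a truncated or padded file.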

oetodd commented 3 years ago

Thanks @nfahlgren! I am new to hyperspectral data analysis so I'll be following up on these issues you pointed out in the next few hours. @bmsowder pointed out that our analysis pipeline makes an additional composite image, so we think this is where the extra frame is coming from.

nfahlgren commented 3 years ago

Looks like the last frame is the composite frame?

nfahlgren commented 3 years ago

I don't know a way. The issue, I think, is that the band would need to be removed and then the data cube would need to be written back out to create a binary BSQ file, and I'm not sure how to do that.

The other options would be:

  1. It's possible to work around pcv.readimage and create a Spectral_data object manually. Not an ideal solution.
  2. Update the data to account for the extra frame: this would require editing the .hdr file to add 1 to the bands and add an extra item to the wavelength set.
  3. We update PlantCV to account for extra bands. This is a little bit tricky because we don't know to expect a composite frame based on the metadata. In principle we could divide the data length by height*width and see if the result matches the number of bands or not, and if not do something with the extra bands (should we discard it/them or something else?).
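
On removing the band and writing the cube back out: BSQ (band sequential) stores each band as one contiguous block, so if the extra frame really is the last one, it can in principle be dropped by copying only the leading bytes. The following is a minimal sketch under those assumptions, untested on BaySpec output. `drop_trailing_bands` is a hypothetical helper, and `bytes_per_value` and `header_offset` are assumed to be taken from the `data type` and `header offset` fields of the .hdr file.

```python
import tempfile
from pathlib import Path


def drop_trailing_bands(src, dst, lines, samples, bands,
                        bytes_per_value, header_offset=0):
    """Copy only the first `bands` bands of a BSQ file to `dst`.

    Assumes the unwanted frames are stored after the real bands.
    Returns the number of bytes that were dropped.
    """
    keep = header_offset + lines * samples * bands * bytes_per_value
    data = Path(src).read_bytes()
    if len(data) < keep:
        raise ValueError("file is smaller than the header describes")
    Path(dst).write_bytes(data[:keep])
    return len(data) - keep


# Tiny synthetic cube: 2 lines x 3 samples, 2 real bands plus 1 extra frame,
# 2 bytes per value (36 bytes in total)
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "cube.bsq"
    dst = Path(tmp) / "cube_trimmed.bsq"
    src.write_bytes(bytes(range(2 * 3 * 3 * 2)))
    dropped = drop_trailing_bands(src, dst, lines=2, samples=3, bands=2,
                                  bytes_per_value=2)
    trimmed_size = dst.stat().st_size
    trimmed_ok = dst.read_bytes() == bytes(range(24))

print(dropped, trimmed_size)  # 12 24
```

The sketch writes to a new file so the original cube is left untouched, and the .hdr file would keep its existing band count.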

oetodd commented 3 years ago

@nfahlgren I opted for adding in one extra band. I changed bands in the .hdr file from 48 to 49, and duplicated the last value in wavelength. With the software we are using for analysis (BaySpec CubeCreator), there is no good way to get rid of the composite image mid-processing. For now this is a work-around that isn't that difficult. This allowed the rest of the notebook attached above to execute without errors. The following two lines, where the error was coming from, are now working:

spectral_array = pcv.readimage(filename=args.image, mode='envi') 
filename = spectral_array.filename
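
The header edit described above (bands plus one, last wavelength duplicated) could also be scripted. This is a minimal sketch that assumes the usual ENVI text layout of `bands = N` and `wavelength = { ... }`. `add_duplicate_band` is a hypothetical helper, the sample header is made up for illustration, and real headers may need different handling, so keep a copy of the original file.

```python
import re


def add_duplicate_band(header_text):
    """Return header text with bands + 1 and the last wavelength repeated."""
    def bump(match):
        return match.group(1) + str(int(match.group(2)) + 1)

    def extend(match):
        values = [v.strip() for v in match.group(2).split(",") if v.strip()]
        values.append(values[-1])  # duplicate the last wavelength value
        return match.group(1) + "{" + ", ".join(values) + "}"

    text, n_bands = re.subn(r"(?im)^(bands\s*=\s*)(\d+)", bump, header_text)
    text, n_wl = re.subn(r"(?im)^(wavelength\s*=\s*)\{([^}]*)\}", extend, text)
    if n_bands != 1 or n_wl != 1:
        raise ValueError("expected exactly one 'bands' and one 'wavelength' field")
    return text


# Made-up miniature header, not the one from Cube.zip
sample_header = (
    "ENVI\n"
    "samples = 3\n"
    "lines = 2\n"
    "bands = 2\n"
    "wavelength units = nm\n"
    "wavelength = {\n 500.0, 600.0\n}\n"
)
updated = add_duplicate_band(sample_header)
print(updated)
```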

nfahlgren commented 3 years ago

Nice, thanks @oetodd!