radiopaedia / dicom-anonymiser

GNU Affero General Public License v3.0
3 stars 5 forks source link

Cornerstone parser library? #4

Open DanielHeath opened 4 years ago

DanielHeath commented 4 years ago

I'll need to pull in cornerstone's dicomParser library ( https://github.com/cornerstonejs/dicomParser ) to use https://github.com/cornerstonejs/cornerstoneWADOImageLoader .

Would we get any substantial simplifications by using it here as well, or should we just parse twice?

plwp commented 4 years ago

It would fix #8, which is definitely a problem in the parsing library.

It seems to be much better maintained so it'd be a generally good idea to migrate anyway, also less code is better.

Anon.ts and Validator.ts would need to be updated to the new interface but I wouldn't think it'd be super complicated?

DanielHeath commented 4 years ago

Looks like dicom-parser has no distinction between the 'meta' data and the rest (see https://github.com/cornerstonejs/dicomParser/blob/82573d94342e12b4bae4fcd2e93f91bb061474cf/src/parseDicom.js#L149 ).

Do we need a way to figure out which ones belong in the meta and which belong in the p10 datastream when writing?

plwp commented 4 years ago

Dicom doesn’t really make a distinction between metadata and data, it’s all just tags.

DanielHeath commented 4 years ago

This repo has got a dicom dictionary which records what the expected format of all the tags are.

dicomParser doesn't, because it only unpacks elements on-demand and you have to ask it for the right type when you ask it to do the unpack. Instead, it stores the offset in the original stream and how to unpack it.

This is awkward when it comes to replacing a field with a new value (especially if the new value has a different length to the original).

So, converting it to use dicomParser will require a bit of fiddling.

One alternative might be to use dicomParser to pull out the image data, which is the only part that's failing to parse, but that seems pretty unsatisfying; the underlying bug seems likely to affect something else at some point.

DanielHeath commented 4 years ago

Isn’t metadata written before p10 data, sometimes with different endianness?

Thanks, Daniel Heath

On 1 Sep 2020, at 6:33 pm, plwp notifications@github.com wrote:

 Dicom doesn’t really make a distinction between metadata and data, it’s all just tags.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

plwp commented 4 years ago

Looks like you’re right there’s a bunch of tags that sit at the start of the file.

http://dicom.nema.org/dicom/2013/output/chtml/part10/chapter_7.html

I guess we’d just have to whitelist the 0002 tags if that’s the case? They should probably be whitelisted anyway to be fair.

DanielHeath commented 4 years ago

I had a try at swapping in the pixel data from dicom-parser (that is, keeping the existing parse, also reading the tag data via dicom-parser, then comparing them).

Based on that, it appears that the read is happening fine.

Based on instrumenting the writer, the write is also happening fine.

This leaves me marginally confused (perhaps some subtlety of how buffers are implemented is the culprit?)

plwp commented 4 years ago

I see what you mean about the fiddling to make it work...