Closed hu3402379 closed 1 year ago
hi @hu3402379,
I've just had a look at this issue...
It seems Daikon does not support anything other than ASCII at the moment.
I have played around with utf-8, big5, gbk and gb18030.
None seems to be perfect, but each is much improved in different ways!
seems others have the same issue: https://github.com/cornerstonejs/dicomParser/issues/89
I have created a branch on my fork: https://github.com/nickhingston/Daikon.git#20-chinese-garbled
It's worth having a play to see which works best; I can't find out how to auto-detect the encoding.
One of your files seems to have a different encoding for patientName and address! But as I can't read Chinese, I've no idea what is correct!
To change the encoding, set daikon.Utils.utfLabel = "encoding";
the encoding list is here: https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/TextDecoder
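A minimal sketch of how the candidate encodings could be compared side by side, assuming a runtime with a global TextDecoder (a browser, or a Node build with full ICU). The helper name decodeWithLabels is hypothetical and is not part of Daikon.

```javascript
// Decode the same bytes with several candidate labels so the results can be compared by eye.
// `bytes` is expected to be a Uint8Array.
function decodeWithLabels(bytes, labels) {
  const results = {};
  for (const label of labels) {
    try {
      results[label] = new TextDecoder(label).decode(bytes);
    } catch (err) {
      // RangeError: this runtime does not know the label.
      results[label] = null;
    }
  }
  return results;
}

// "中文" encoded as UTF-8; only the utf-8 entry will read correctly.
const sample = new Uint8Array([0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87]);
console.log(decodeWithLabels(sample, ["utf-8", "big5", "gbk", "gb18030"]));
```

Without the fatal option, TextDecoder substitutes U+FFFD for bytes it cannot decode instead of throwing, so a wrong label shows up as garbled text rather than an error.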
cheers
nick
@nickhingston, aha, thanks for your reply.
I also raised this question as cornerstonejs/dicomParser#89, and 'iconv-lite' can solve this issue.
I have tried to add this lib to Daikon, but after downloading Daikon the compilation fails and I don't know why. Can you help me import the 'iconv-lite' library into Daikon and use it to convert the patient name, and then provide me with the compiled daikon.js?
thanks a lot.
ah, ok! Seems my unzip or OSX can't cope with Chinese filenames either!
I have pushed an updated version using iconv-lite to that same branch.
It has exactly the same problems as using TextDecoder
using 'gbk' or 'gb18030' I get
(0010,0010) Patient Name REMOVED
(0010,1040) Patient Address REMOVED
&
(0010,0010) Patient Name REMOVED
which looks close-ish!
using utf-8:
(0010,1040) Patient Address REMOVED
But patient names are very garbled in both files!
What were the settings you used to get iconv to work for everything?! The setting I have pushed to my branch is 'gb18030'.
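Since one element reads correctly as utf-8 and another as gb18030, one possible heuristic is to try a strict UTF-8 decode first and fall back only when it fails. This is a sketch under that assumption, not an established detection method; decodeFirstValid is a hypothetical helper. The order matters, because a permissive multi-byte encoding such as gb18030 accepts many byte sequences that are really UTF-8.

```javascript
// Try each label in order with strict decoding and return the first that succeeds.
// `bytes` is expected to be a Uint8Array.
function decodeFirstValid(bytes, labels = ["utf-8", "gb18030"]) {
  for (const label of labels) {
    try {
      // fatal: true makes decode() throw a TypeError on invalid input
      // instead of silently inserting U+FFFD.
      const text = new TextDecoder(label, { fatal: true }).decode(bytes);
      return { label, text };
    } catch (err) {
      // Invalid bytes for this encoding, or label unsupported here: try the next one.
    }
  }
  return null;
}

// Valid UTF-8 is accepted by the first candidate.
console.log(decodeFirstValid(new Uint8Array([0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87])));
```

A successful decode only shows that the bytes are valid in that encoding, not that the text is what the writer intended, so the result still needs checking by someone who can read it.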
I guess the other question is, what build errors are you getting from npm run build?
actually please tell me our postings are OK from a privacy perspective?!
maybe we need to talk directly about this and files/names/address removed...
@nickhingston
1. What were the settings you used to get iconv to work for everything?!
'gbk' or 'gb18030' is correct.
(0010,0010) Patient Name [**=�$)A**�-**] (0010,0010) Patient Name [**(**)]
'utf-8' is correct
(0010,1040) Patient Address [**105-101]
'gbk' or 'gb18030' is wrong.
(0010,1040) Patient Address [105-101]
I don't know how to make iconv work for every tag yet... It's very strange that each tag uses a different encoding.
2. npm run build error:
@nickhingston Thanks for your advice. This is not good from a privacy perspective, so we can delete this information now. I have edited my comment to replace the key info with ****
yeah, it doesn't make sense to me! Editing to remove personal info... I guess you should remove the files from cornerstone's parser too!!
I can't edit your posts... please remove personal info from there.
If you want to contact me directly to continue any conversation that may involve personal info...use my email, i've just made it public on my profile...
did you do npm install first? Can you provide the debug.log file the error message references?
what version of node are you using?
@nickhingston I have deleted the personal info. I ran 'npm install' before 'npm run build'.
node version: v8.9.3
npm version: v6.1.0
there is still the patient address (copied from mine) in one of your posts:
'gbk' or 'gb18030' is wrong.
(0010,1040) Patient Address
yes but it references a debug.log file...
@nickhingston I submitted the debug.log file in my last comment. Can you help me find the reason for the error? Thanks.
yeah, it looks like you haven't run npm install
was there any error from that?
@nickhingston Actually, I did it, and everything looked correct.
Ah, think I see the problem! Looks like rm failed (load of Chinese after it? )
you need to run this in a unix like shell. As you are using windows - Mingw or something....
@nickhingston Yeah, my environment is Windows. I will try it. Thanks.
@nickhingston I have run the build in babun, but I still get an error. I have also installed browserify globally.
hmm, seems rm thinks that option is for it, it isn't!
the command is multiline
rm -rf build; mkdir build; browserify --standalone daikon src/main.js -o build/daikon.js; uglifyjs build/daikon.js -o build/daikon-min.js
the option should be for browserify...so something odd with your shell still. So either use a 'proper' OS, or try running these commands separately!
rm -rf build
mkdir build
browserify --standalone daikon src/main.js -o build/daikon.js
uglifyjs build/daikon.js -o build/daikon-min.js
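If the shell keeps misreading the first two steps, they could in principle be done from Node instead, which avoids depending on a Unix-like rm. This is a sketch only: resetBuildDir is a hypothetical helper, not something in the Daikon repository, and fs.rmSync needs Node 14.14 or later, which is newer than the v8.9.3 reported above.

```javascript
const fs = require("fs");

// Equivalent of `rm -rf build; mkdir build`, without relying on a Unix shell.
function resetBuildDir(dir) {
  // force: true means a missing directory is not an error.
  fs.rmSync(dir, { recursive: true, force: true });
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

Calling resetBuildDir("build") from the project root would then be followed by the browserify and uglifyjs commands exactly as listed above.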
@nickhingston I ran the commands separately and they compiled successfully. Then I replaced daikon.js in rii-mango/papaya and ran './papaya-builder.sh', which failed. If I don't replace it, the build succeeds.
rm -rf build
mkdir build
browserify --standalone daikon src/main.js -o build/daikon.js
uglifyjs build/daikon.js -o build/daikon-min.js
@nickhingston I have solved it. Maybe it is an error in Papaya's build rules: 'byte' is treated as a reserved word. In the 'daikon.js' source code:
// Checks the type of a UTF-8 byte, whether it's ASCII, a leading byte, or a
// continuation byte. If an invalid byte is detected, -2 is returned.
function utf8CheckByte(byte) {
  if (byte <= 0x7F) return 0;
  else if (byte >> 5 === 0x06) return 2;
  else if (byte >> 4 === 0x0E) return 3;
  else if (byte >> 3 === 0x1E) return 4;
  return byte >> 6 === 0x02 ? -1 : -2;
}
Modified code:
// Checks the type of a UTF-8 byte, whether it's ASCII, a leading byte, or a
// continuation byte. If an invalid byte is detected, -2 is returned.
function utf8CheckByte(val) {
  if (val <= 0x7F) {
    return 0;
  } else if (val >> 5 === 0x06) {
    return 2;
  } else if (val >> 4 === 0x0E) {
    return 3;
  } else if (val >> 3 === 0x1E) {
    return 4;
  } else {
    return val >> 6 === 0x02 ? -1 : -2;
  }
}
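As a quick check that the rename changes nothing but the parameter name, both versions can be run over every possible byte value. The two function names below are made up for the comparison. Current JavaScript engines accept byte as an identifier; it was reserved only in the older ES3 edition, which some build tools still enforce.

```javascript
// Version with `byte` as the parameter name.
function utf8CheckByteOriginal(byte) {
  if (byte <= 0x7f) return 0;
  else if (byte >> 5 === 0x06) return 2;
  else if (byte >> 4 === 0x0e) return 3;
  else if (byte >> 3 === 0x1e) return 4;
  return byte >> 6 === 0x02 ? -1 : -2;
}

// Version with the parameter renamed to `val`.
function utf8CheckByteRenamed(val) {
  if (val <= 0x7f) {
    return 0;
  } else if (val >> 5 === 0x06) {
    return 2;
  } else if (val >> 4 === 0x0e) {
    return 3;
  } else if (val >> 3 === 0x1e) {
    return 4;
  } else {
    return val >> 6 === 0x02 ? -1 : -2;
  }
}

// Compare the two versions for all 256 byte values.
for (let b = 0; b <= 0xff; b++) {
  if (utf8CheckByteOriginal(b) !== utf8CheckByteRenamed(b)) {
    throw new Error("versions differ for byte " + b);
  }
}

// ASCII, 2-byte lead, 3-byte lead, 4-byte lead, continuation, invalid
console.log([0x41, 0xc3, 0xe4, 0xf0, 0xb8, 0xff].map(utf8CheckByteRenamed));
// prints [ 0, 2, 3, 4, -1, -2 ]
```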
In addition, a 'dicom-parser' contributor told me some good news: he has released an npm package, 'dicom-character-set', to convert all encodings. You can try it if you want to.
@hu3402379 Good news! I just released an npm package to convert bytes from a DICOM text attribute into a JS string. It'll handle all of the encodings, including with extensions. Check it out here.
ok, interesting - that makes it a Papaya bug then - Daikon does not have this method!
I suppose the question should be - where should the conversion take place?! My preference would be in the parser itself. In which case this Issue still stands...
@martinezmj what do you think?
will look into dicom-character-set - will certainly be useful for my project (which does not use Papaya).
I am also considering forking off Daikon more 'officially' and calling it something like DICOMgpu.js...because i can't seem to get any response to my PRs on this project, and I think people would like a GPU accelerated javascript DICOM decoder!
@hu3402379 I looked into dicom-character-set.
iconv seems to do a better job for your files than this...
Have you had any luck with it in the other lib you are using?
FYI, i'm using it like this: .convertBytes('ISO 2022 IR 100\ISO 2022 IR 58', new Uint8Array(strBuff), {vr: 'LT'});
iconv sorts out both the name and address i think, but I get "XU ER ZHU=$)A" & "-A=XU ER ZHU" around the name. Is this supposed to be there?!
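The "$)A" and "-A" fragments look like the printable tails of ISO 2022 escape sequences: ESC $ ) A designates GB 2312 (ISO 2022 IR 58) and ESC - A designates Latin-1 (ISO 2022 IR 100), while "=" separates the component groups of a person name. The sketch below splits a value at those two escapes so each run can be decoded on its own. The function names and decoder labels are assumptions, only these two escapes are handled, and the DICOM rule that the active character set resets at certain delimiters is ignored.

```javascript
const ESC = 0x1b;

// Only the two escape sequences relevant to this thread; labels are assumed decoder names.
const ESCAPES = [
  { seq: [ESC, 0x24, 0x29, 0x41], label: "gb18030" }, // ESC $ ) A : ISO 2022 IR 58
  { seq: [ESC, 0x2d, 0x41], label: "windows-1252" },  // ESC - A   : ISO 2022 IR 100
];

// Return the escape entry that starts at `pos`, or undefined if none matches.
function matchEscape(bytes, pos) {
  return ESCAPES.find(({ seq }) => seq.every((b, i) => bytes[pos + i] === b));
}

// Split a Uint8Array into runs, each tagged with the decoder label active for it.
function splitSegments(bytes, initialLabel = "windows-1252") {
  const segments = [];
  let label = initialLabel;
  let start = 0;
  let pos = 0;
  while (pos < bytes.length) {
    const esc = bytes[pos] === ESC ? matchEscape(bytes, pos) : undefined;
    if (!esc) {
      pos += 1;
      continue;
    }
    if (pos > start) segments.push({ label, bytes: bytes.slice(start, pos) });
    label = esc.label;
    pos += esc.seq.length; // drop the escape sequence itself
    start = pos;
  }
  if (pos > start) segments.push({ label, bytes: bytes.slice(start) });
  return segments;
}

// Decode each run with its own label and join the results.
function decodeValue(bytes) {
  return splitSegments(bytes)
    .map((s) => new TextDecoder(s.label).decode(s.bytes))
    .join("");
}
```

decodeValue needs a runtime whose TextDecoder supports gb18030; splitSegments has no such requirement and can be paired with any decoder, iconv included.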
@nickhingston
Regardless of whether the result is converted with iconv or dicom-character-set, I think the patient name itself is garbled in the file.
So the patient name in the DICOM file is not correctly encoded.
I think both iconv and dicom-character-set are good for fixing the garbled Chinese.
OK, thanks, though I think it's the other way around: I think the address is not in the specified ('ISO 2022 IR 100\ISO 2022 IR 58') encoding.
dicom-character-set doesn't work; iconv does (at least to some degree, as it's a bit 'cleverer' at working out unknowns).
I'll create a PR using the iconv solution. Thanks for your input on this.
@nickhingston Do you know about Papaya's MPR function? I want to ask you some questions later. Do you mind leaving your email so we can communicate that way?
Thanks a lot
i'm afraid not. I don't use papaya.
Feel free to email directly - details should be on my profile...
I agree this should be solved in Daikon parser rather than Papaya. Did the proposed solution work? Was there a PR that came out of this? If not, I'll take a look at it next. Sorry for my late reply.
No not a specific solution, iconv works if you hardcode the format, but there is no obvious way of converting from the DICOM spec to iconv format name. The other library didn't work for me at all…
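One way to bridge that gap might be a small hand-written table from the terms used in Specific Character Set (0008,0005) to decoder labels. The sketch below covers only a few terms. The table, the function names and the right-hand labels are assumptions to check against whichever decoder is used, since iconv names may differ from TextDecoder labels. The ISO 2022 entries also need their escape sequences handled separately.

```javascript
// Hypothetical lookup table: DICOM character set terms -> decoder labels (assumed).
const DICOM_TO_LABEL = {
  "ISO IR 100": "iso-8859-1",
  "ISO IR 192": "utf-8",
  "GB18030": "gb18030",
  "GBK": "gbk",
  "ISO 2022 IR 100": "iso-8859-1",
  "ISO 2022 IR 58": "gb18030", // GB 2312 bytes, once the escape sequences are removed
  "ISO 2022 IR 149": "euc-kr", // KS X 1001 (Korean), same caveat
};

// Accept both "ISO_IR 192" and "ISO IR 192" spellings.
function normaliseTerm(term) {
  return term.trim().toUpperCase().replace(/_/g, " ").replace(/\s+/g, " ");
}

// The attribute is multi-valued, with values separated by a backslash.
// Unknown or empty terms map to null so the caller can decide on a default.
function labelsForSpecificCharacterSet(value) {
  return value
    .split("\\")
    .map((term) => DICOM_TO_LABEL[normaliseTerm(term)] || null);
}

console.log(labelsForSpecificCharacterSet("ISO 2022 IR 100\\ISO 2022 IR 58"));
```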
@rii-mango do you have any solution for that? I have similar trouble with Korean, and the returned patient name has some special symbols added.
hi @martinezmj, I have some DICOM files in which the patient name is Chinese, and daikon.Image.prototype.getPatientName returns garbled text. I do not know how to solve this issue. The attached file is a garbled .dcm file for testing. Thanks for your help.