Closed GoogleCodeExporter closed 9 years ago
Oops, I just saw that the fonts are taken from QCad, so I guess they do not contain the necessary non-ASCII characters.
I will have to look for some fonts with German umlauts and see whether this works.
Original comment by g.mue...@gmail.com
on 6 Jun 2010 at 10:05
I would be curious to know if this works too. For the CXF fonts, the character 'name' is normally expressed as a single character between the square brackets. There are examples (the RomanCS font is one) where the character is specified as a '#' character followed by four hexadecimal digits. I have assumed that these can be plugged straight into the 'wxString::ToLong()' call with a base-16 parameter (see line 538 of CxfFont.cpp). I have no idea whether this is correct or not.
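For anyone following along, the two naming forms described above could be parsed roughly like this. It is only a minimal sketch in standard C++, not the code from CxfFont.cpp: the function name parse_glyph_name is hypothetical, and it assumes a name is either a single byte or a '#' followed by exactly four hexadecimal digits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from HeeksCAD. 'name' is the text found
// between the square brackets of a CXF glyph header, e.g. "a" or "#0061".
// On success the glyph's numeric code is stored in code_point.
bool parse_glyph_name(const std::string& name, unsigned long& code_point) {
    if (name.size() == 5 && name[0] == '#') {
        // Accept only four hexadecimal digits after the '#'.
        for (std::size_t i = 1; i < name.size(); ++i) {
            if (!std::isxdigit(static_cast<unsigned char>(name[i]))) {
                return false;
            }
        }
        code_point = std::stoul(name.substr(1), nullptr, 16);
        return true;
    }
    if (name.size() == 1) {
        // Go through unsigned char so that bytes above 0x7F stay positive.
        code_point = static_cast<unsigned char>(name[0]);
        return true;
    }
    return false;  // Neither naming form: let the caller decide what to do.
}
```

Returning false instead of silently skipping an unrecognised name makes it easier to notice when a whole family of glyphs is being dropped.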
If you can give me a specific example where a particular character is in the font file but it is not presented correctly I would be happy to repair the problem.
Original comment by David.Ni...@gmail.com
on 7 Jun 2010 at 2:06
Hi David,
I think he means "special" characters like üöäÜÖÄ,
e.g. courier.cxf line 758, the "Ö" character.
I have taken a screenshot of this...
This problem also exists in the free version of QCad (2.0).
They have fixed it in the commercial version, QCad 2.2.
Hans
Original comment by aachen.h...@googlemail.com
on 7 Jun 2010 at 11:41
Attachments:
Hans,
could you please add the Heeks file to this fault? I don't know how to add text
with these special characters.
Thanks
David Nicholls
Original comment by David.Ni...@gmail.com
on 7 Jun 2010 at 1:32
David,
What I see first is in the screenshot above:
on the left, in Properties, the correct characters, and on the right, on the canvas, the wrong ones...
When I save the file and load it again, it is exactly reversed:
on the left, in Properties, the wrong characters, and on the right, on the canvas, the correct ones...
Hans
Original comment by aachen.h...@googlemail.com
on 7 Jun 2010 at 7:36
Attachments:
Hans,
you are right. I was actually trying to make a door bell plate with my last
name engraved (containing an 'ü'-Umlaut).
When saving to XML, this character is saved in ISO-Latin1 encoding (0xFC in
this case).
Could there be a problem with Ctt() and Ttc() from strconv.cpp? It seems that
std::wstring::push_back() does the magic of converting from char to wchar_t.
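To illustrate why a plain char-to-wchar_t push_back can go wrong for a byte such as 0xFC, here is a minimal sketch in standard C++. It is an assumption about the kind of problem involved, not a description of what Ctt() or Ttc() actually do, and latin1_to_wide is a hypothetical name.

```cpp
#include <string>

// Hypothetical helper: widen ISO-8859-1 bytes to wide characters.
// The cast through unsigned char matters. Where plain char is signed,
// pushing the byte 0xFC directly would widen it as a negative value
// rather than as the code for 'ü' (0xFC).
std::wstring latin1_to_wide(const std::string& bytes) {
    std::wstring wide;
    wide.reserve(bytes.size());
    for (char byte : bytes) {
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(byte)));
    }
    return wide;
}
```

This works only because the first 256 Unicode code points coincide with ISO-8859-1; other 8-bit character sets need a real conversion table.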
Sorry for not digging deeper yet, but I am not into this Unicode and character set stuff (yet).
Guido
Original comment by g.mue...@gmail.com
on 7 Jun 2010 at 9:46
David, Hans,
I tried the following:
In HText::ReadFromXMLElement(TiXmlElement* pElem) I changed:
text.assign(Ctt(a->Value()));
to
text = wxString::From8BitData(a->Value());
After loading a *.heeks with a text object containing ISO-Latin1, this actually
restores the original state: Properties displaying the correct string (with
non-ASCII chars), but the empty rectangle is again displayed in the drawing
window.
So the font lookup just needs the ISO-Latin1 encoding, but wxWidgets uses its own internal encoding. There is probably just a conversion back to ISO-Latin1 needed before the font lookup.
It also seems better to use wxString::From8BitData() and To8BitData() instead of Ctt() and Ttc(). But this "only" converts from/to ISO-8859-1. What about Eastern European character sets?
Would it be feasible to store the text object strings in Unicode already? Is the font lookup capable of addressing more than ISO-8859-1?
Guido
Original comment by g.mue...@gmail.com
on 8 Jun 2010 at 9:48
Guido,
well done. From your lead, I looked at the way we were reading the character 'names' in from the CXF font files and I was making the same mistake there. I had written an 'ss_to_wstring' routine in the strconv.cpp file and it was also only using 7 bits rather than the full 8 bits. When I changed that to use the wxString::From8BitData() routine as well, the whole thing seemed to fall into place.
I had one confusion and I still don't understand it. The TiXML classes have comments through them indicating that they use UTF8 by default and so I would have expected the Attribute() methods to return UTF8 character strings. I don't believe they do as, when I tried the wxString::FromUTF8() call instead of wxString::From8BitData() call, it produced an empty string.
I can live with this confusion. The only item left on this issue is that of supporting character encoding other than ISO-8859-1. This is certainly enough for English. I don't know whether you want to continue to investigate other options or whether this fix is enough for your needs.
I started looking at classes such as wxCSConv, wxFontMapper and wxLocale but I quickly became more confused.
I hope you don't mind but I have used your find (wxString::From8BitData()) and implemented the fix in both the HText.cpp and CxfFont.cpp files in Subversion.
Thanks
David Nicholls
Original comment by David.Ni...@gmail.com
on 9 Jun 2010 at 3:57
David, Guido,
Thanks for your work on this.
Working in HeeksCAD with these non-ASCII characters looks good.
The last problem now is loading the saved .heeks file.
The error message "Error reading Attributes." is displayed...
When I open the .heeks file in an editor and delete the special characters,
I can load the file (without the special characters).
Hans
Original comment by aachen.h...@googlemail.com
on 9 Jun 2010 at 8:38
Hans,
do you mean you can't load the 'unknown.heeks' file attached to this fault? If so I will need to keep looking at it. I can load that file fine on my machine. That is what I have been using for testing.
If it is a different file, please attach it to this issue.
Thanks
David Nicholls
Original comment by David.Ni...@gmail.com
on 9 Jun 2010 at 9:20
David,
Sorry, I forgot the attachment...
The first file, unknown.heeks: no problems.
The new one, unknown2.heeks, fails...
Hans
Original comment by aachen.h...@googlemail.com
on 9 Jun 2010 at 1:32
Attachments:
David,
thanks for the quick fix! I might have found the location of the problem, but you have a better overview of the places where the fix needs to be applied.
Hans, your report about the behavior after saving and loading actually
triggered me to look at the right places in the code.
David, yes, for me this fix is sufficient. But we could also think of saving
and loading the XML data as UTF8 (using ToUTF8() and FromUTF8() in HText) and
just use To8BitData() in CxfFont.cpp.
I think it would be a good idea in general to do any conversion between wxString and XML with the UTF8 conversion methods.
BTW, I tried to convert some ttf to cxf (using ttf2cxf), but the resulting Times_Roman.cxf could not be properly addressed. HeeksCAD just displays rectangles for all characters. Instead of having just the character in brackets (like "[a]"), the cxf generated by ttf2cxf contains something like "[#0061]".
Well, that should probably end up in a separate issue.
Guido
Original comment by g.mue...@gmail.com
on 9 Jun 2010 at 7:55
Guido,
I didn't know there was any such thing as 'ttf2cxf'. I will have a look at it. The CxfFont.cpp file does try to interpret the [#0061] format but it evidently does not do so correctly.
I will have another look at reading the XML data as UTF8 but my first try didn't work. There may be some conversion in TiXML that I'm not expecting.
Leave this fault open for these changes.
Thanks
David
Original comment by David.Ni...@gmail.com
on 9 Jun 2010 at 9:33
David,
it worked for me when changing both: HText::WriteXML() and
HText::ReadFromXMLElement() to the UTF8 methods. OK, you need to save to UTF8
first or create a new *.heeks file.
I was wondering whether it is worth going through the whole code to see if we could replace all of the Ctt() and Ttc() conversions with the UTF8 methods. I have seen that Ctt() and Ttc() are also used for converting between filenames and wxStrings, etc.
But I am not sure whether filenames and paths should be handled in UTF8. It might also be quite some work...
Guido
Original comment by g.mue...@gmail.com
on 9 Jun 2010 at 10:11
Guido,
I agree that it's worth changing both HeeksCAD and HeeksCNC to use UTF-8 in all circumstances. If we don't do this then we're just going to bump into the same problem in another place.
I am playing with the idea of changing the definitions of Ttc() and Ctt() so that they're as shown below. It will be quite a bit of work but I'm happy to do it. It may just take me a day or two to get it done.
The definitions would be:
#define Ttc(s) (const char *) wxString(s).mb_str(wxConvUTF8)
inline wxString Ctt(const char String[] = "")
{
    return wxString(String, wxConvUTF8);
}
These definitions seem to work for the Linux build but I need to make sure they also work for the Windows build. I'm not sure whether Dan uses the Unicode builds or the Release (non-Unicode) builds when he releases the Windows version of Heeks.
I also had a look at the ttf2cxf conversion utility. It does an excellent job. I can see that the character names are stored in the [#0228] format but I'm still working through how to use this value. It's obviously a 16-bit number. I was using a wxChar as the key into each font map but I think I need to change that to an 'unsigned long'. That's what the character names are stored as within the ttf2cxf conversion code. I think that means that the characters are all being stored as Unicode characters. With our other changes we're reading and writing UTF-8 characters into and out of the wxString variables. From my reading the wxString always stores the characters in Unicode internally. I am expecting to be able to read the Unicode character back when I need to compare it with the font map we read in during the CXF font parsing process.
I may have missed something but I think it will all work out alright. It will just take me a little while.
Thanks
David
Original comment by David.Ni...@gmail.com
on 10 Jun 2010 at 11:40
David,
be careful about where you use UTF8, though. For the ASCII character set it is one byte per character, but it can also be multiple bytes per character (up to 4).
So if there are places where you expect a one-byte value in order to do some lookup (like the fonts), this might fail badly.
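The byte counts mentioned above can be seen in a small standalone encoder. This is a minimal sketch in standard C++ with a hypothetical function name (encode_utf8); it is not part of HeeksCAD or wxWidgets, and it does not reject surrogates or values above U+10FFFF.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// No validation is done; the caller is assumed to pass a valid code point.
std::string encode_utf8(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {            // 1 byte: plain ASCII
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {    // 2 bytes: covers the German umlauts
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {  // 3 bytes
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                    // 4 bytes
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

With this, 'a' (0x61) stays a single byte, while 'ü' (0xFC) becomes the two bytes 0xC3 0xBC, so any lookup that indexes a font map by single bytes of a UTF8 string would miss it.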
Guido
Original comment by g.mue...@gmail.com
on 10 Jun 2010 at 7:26
Guido and Hans,
I have gone ahead with this change as I believe it to be correct for both Linux and Windows. People will either be quite happy with the change or they will be out buying pitchforks tonight.
The error that was seen when the [#0021] form of the character 'name' was used was not due to a char versus wchar_t difference as I had expected. It was due to the way I had constructed my CXF file parsing loop. It was simply skipping the characters that used this naming convention.
I can now embed characters with the umlaut and render them using a converted TrueType font (via ttf2cxf). I'm so pleased you directed me to this utility. I will definitely use it from now on. I had been using the 'fonttracer' utility (I think that's what it was called). The ttf2cxf conversion is much more convenient.
Unless I've forgotten something, I think this solves all the problems included in this issue. If I have missed something please let me know.
Thanks
David Nicholls
Original comment by David.Ni...@gmail.com
on 11 Jun 2010 at 4:39
Guido,
I have added a note to the Fonts wiki so that people will know that the ttf2cxf conversion utility exists.
Thanks
David
Original comment by David.Ni...@gmail.com
on 11 Jun 2010 at 4:55
Hi David/Guido,
Let's put it like this: great work!
From my side it looks good...
When you use ttf2cxf, have a look at this:
http://www.ribbonsoft.com/rsforum/viewtopic.php?t=415
THX
Hans
Original comment by aachen.h...@googlemail.com
on 11 Jun 2010 at 9:27
David,
With the Windows build, when I try to open a STEP file or a heeks file which
contains solids, I get "STEP import not done!". It seems to be because of the
change to the function "Ttc". On this line:
Standard_CString aFileName = (Standard_CString) (Ttc(filepath));
( line 709 Shape.cpp )
filepath looks correct, but aFileName seems to be corrupted.
Maybe your changes should be done just for Linux?
Dan.
Original comment by danhe...@gmail.com
on 14 Jun 2010 at 4:49
Dan,
can you attach a STEP file to this issue for me to reproduce the problem with? I would be eager to leave the change in for both Windows and Linux as it makes the language handling consistent.
Ta
David
Original comment by David.Ni...@gmail.com
on 14 Jun 2010 at 10:20
David,
I have tried it with many STEP files, including "cube.step".
You can easily make your own by creating a cube and then saving it as "cube.step".
Dan.
Original comment by danhe...@gmail.com
on 14 Jun 2010 at 10:31
Dan,
I think this last change fixes the code for all situations (famous last words). I believe the problem was due to the Shape code holding a const char * pointer to what is a very temporary buffer. It now uses the static std::string concept as per the last version except that it still supports the UTF-8 format if needed. This should work for all languages. It's still a little dangerous but I think that, since we're not using Ttc() twice between getting the pointer and using it, we're getting away with it.
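The static std::string idea, and why it is "still a little dangerous", might look roughly like the following. This is a minimal sketch in standard C++ under my own assumptions, not the actual HeeksCAD code: to_c_string and owned_copy are hypothetical names, and the UTF-8 conversion step is left out.

```cpp
#include <string>

// Hypothetical stand-in for a Ttc()-style helper. The returned pointer
// refers to a function-local static buffer, so it outlives the call,
// but it is only valid until the next call overwrites that buffer.
// It is also not safe to use from several threads at once.
const char* to_c_string(const std::string& text) {
    static std::string buffer;
    buffer = text;
    return buffer.c_str();
}

// Safer pattern for callers: copy straight away into a string they own,
// so a later call to to_c_string() cannot invalidate their data.
std::string owned_copy(const std::string& text) {
    return std::string(to_c_string(text));
}
```

A caller that needs the pointer for longer than one statement, for example to hand a file name to another library, can keep the result of owned_copy() in a local variable and pass its c_str() instead.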
If you still have trouble, please let me know.
Ta
David
Original comment by David.Ni...@gmail.com
on 15 Jun 2010 at 11:05
David,
Opening a STEP file works OK now for Windows and Linux, for me, since your
recent changes.
Thanks.
Dan.
Original comment by danhe...@gmail.com
on 15 Jun 2010 at 1:31
Original comment by David.Ni...@gmail.com
on 15 Jun 2010 at 1:40
Original issue reported on code.google.com by
g.mue...@gmail.com
on 6 Jun 2010 at 10:00