Sicos1977 / IFilterTextReader

A reader that gets text from different file formats through the IFilter interface

One suggestion how to improve app performance #8

Closed win32nipuh closed 9 years ago

win32nipuh commented 9 years ago

I tested the application and found how to improve performance ;-)

IFilterTextViewer MainForm.cs

    while ((line = reader.ReadLine()) != null)
    {
        //text += line + Environment.NewLine; // <--- error ;-)
        text = line + Environment.NewLine;
        FilterTextBox.AppendText(text);
        Application.DoEvents();
    }
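To illustrate why the `+=` version is slow, here is a minimal sketch, not the demo program's actual code. It assumes a `StringBuilder` as a stand-in for `FilterTextBox`; the class and method names (`AppendDemo`, `AppendAccumulated`, `AppendPerLine`) are hypothetical. With `+=`, the whole accumulated text is appended again on every line, so the output grows roughly quadratically with the number of lines.

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendDemo
{
    // Mimics the "+=" variant: the accumulated text is appended again for every line.
    public static string AppendAccumulated(TextReader reader)
    {
        var sink = new StringBuilder(); // hypothetical stand-in for FilterTextBox
        var text = string.Empty;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            text += line + Environment.NewLine; // text keeps growing
            sink.Append(text);                  // everything read so far is appended again
        }
        return sink.ToString();
    }

    // Mimics the corrected variant: only the current line is appended.
    public static string AppendPerLine(TextReader reader)
    {
        var sink = new StringBuilder(); // hypothetical stand-in for FilterTextBox
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            sink.Append(line).Append(Environment.NewLine);
        }
        return sink.ToString();
    }
}
```

For a three-line input, the accumulated variant emits 1 + 2 + 3 = 6 lines instead of 3, and the gap widens quickly for larger files.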

Sicos1977 commented 9 years ago

You are right.. that line is a little bit overdone :-) I modified the example program and missed that += completely.

I tried the RTF file; it worked without any problems over here.

Did you use the demo program, and did that one give the exception, or did you use the FilterReader constructor with a stream input?

    public FilterReader(Stream stream,
                        string extension,
                        bool disableEmbeddedContent = false,
                        bool includeProperties = false)

[Screenshot: rtf output]

win32nipuh commented 9 years ago

I tried both ways: the demo and my own application with a stream. I checked on another computer (Windows 8.1 Pro) and RTF works fine there. In any case, it is an interesting situation. OK, thank you.

win32nipuh commented 9 years ago

By the way, I opened a 2 MB xls file, and the demo app takes up to 1 minute to process it. I tried it on 2 machines and it takes that long on both. Is it possible to improve performance?

Sicos1977 commented 9 years ago

Then it is probably a problem with the local IFilter; maybe it is corrupt. Try reinstalling the filter. I also have Windows 8.1 Pro over here and it isn't giving me any problems.

Can you send me the xls file?

Sicos1977 commented 9 years ago

You could try to read the file in blocks like this:

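    var buffer = new char[8192];
    int read;
    // ReadBlock returns the number of characters read; append only that many,
    // so stale characters from a previous block are not repeated at the end.
    while ((read = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
    {
        // No newline is added here; block boundaries are not line boundaries.
        FilterTextBox.AppendText(new string(buffer, 0, read));
        Application.DoEvents();
    }

    FilterTextBox.AppendText(Environment.NewLine + "* DONE *" + Environment.NewLine);
    Application.DoEvents();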
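The block-reading idea can be tried without the IFilter or a form. The sketch below assumes any `TextReader` (a `StringReader` works for testing) and collects the text in a `StringBuilder`; `BlockReadDemo` and `ReadAllInBlocks` are hypothetical names, not part of IFilterTextReader. The point it shows is that `ReadBlock` returns the number of characters it filled, and only that part of the buffer should be used.

```csharp
using System;
using System.IO;
using System.Text;

public static class BlockReadDemo
{
    // Reads everything from the reader in fixed-size blocks and returns the text.
    public static string ReadAllInBlocks(TextReader reader, int blockSize)
    {
        var buffer = new char[blockSize];
        var result = new StringBuilder();
        int read;

        // ReadBlock returns 0 once the end of the text has been reached.
        while ((read = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
        {
            // Append only the characters filled in this pass; the last block
            // is usually shorter than the buffer.
            result.Append(buffer, 0, read);
        }

        return result.ToString();
    }
}
```

Calling `BlockReadDemo.ReadAllInBlocks(new StringReader("hello world"), 4)` reads the text in blocks of 4, 4, and 3 characters and returns the original string unchanged.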

win32nipuh commented 9 years ago

That is my Excel file http://rghost.net/6qYSSDFBB

win32nipuh commented 9 years ago

Yes, you are right, now it takes 2-3 seconds. :-) Thank you :+1:

win32nipuh commented 9 years ago

I am testing your lib in my SQL CLR function. I'd like to extract plain text from files and store only that instead of the file content. I want to use it in full-text search queries.

Sicos1977 commented 9 years ago

I do something similar. I wrote IFilterTextReader to recognize text inside e-mails and attachments.