chaddro / EasyPopulate-4.0

Data import and export module for Zencart 1.3.x and 1.5.x
GNU General Public License v2.0
24 stars 31 forks

Update request: admin/easypopulate_4_import.php #22

Closed: kotsku closed this issue 3 years ago

kotsku commented 8 years ago

Can you change the fopen mode used to read the import file?

e.g. change $handle = fopen($file_location, "r") to $handle = fopen($file_location, "rb")

Then write one row at a time to SQL.

"i need a function to simulate a "wget url" and do not buffer the data in the memory to avoid thouse problems on large files..." - http://php.net/manual/en/function.fopen.php

Then there would be no need to split files.
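
A minimal sketch of what is being requested might look like the following; the staging table name, the column layout, and the Zen Cart $db/zen_db_input helper calls are assumptions for illustration, not EasyPopulate's actual import code:

<?php
// Sketch only: open the import file in binary mode and insert one row at a
// time, so the whole file is never buffered in memory.
// TABLE_EP4_IMPORT_STAGING and the two columns are hypothetical placeholders.
$handle = fopen($file_location, 'rb');   // 'rb' instead of 'r', as requested
if ($handle === false) {
    die('EP4: unable to open import file: ' . $file_location);
}
while (($row = fgetcsv($handle, 0, "\t")) !== false) {
    // Only the current row is held in memory at any point.
    $model = zen_db_input(zen_db_prepare_input($row[0]));
    $name  = zen_db_input(zen_db_prepare_input($row[1]));
    $db->Execute(
        "INSERT INTO " . TABLE_EP4_IMPORT_STAGING
        . " (v_products_model, v_products_name)"
        . " VALUES ('" . $model . "', '" . $name . "')"
    );
}
fclose($handle);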

mc12345678 commented 8 years ago

Per the associated reference: "As of PHP 4.3.2, the default mode is set to binary for all platforms that distinguish between binary and text mode. If you are having problems with your scripts after upgrading, try using the 't' flag as a workaround until you have made your script more portable as mentioned before."

Binary mode is forced by adding the b after the r, as suggested above. Per the notes in the PHP manual, binary is already the default on systems that differentiate between the two modes, so on such a system the b would not be necessary.
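
In other words, the requested change is a one-character addition to the mode string, and it is harmless even where it is redundant. A minimal illustration, assuming nothing beyond the $file_location variable the importer already uses:

<?php
// 'b' is redundant where binary is already the default (PHP >= 4.3.2 on
// platforms that distinguish binary and text mode) and has no effect on
// platforms that never distinguish them, so adding it mainly documents intent.
$handle = fopen($file_location, 'rb');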

As to the current usage/functionality, the code does not work like a wget of a URL: it reads line by line (and, per the statement above, the data is already treated as binary on systems that distinguish the modes), so it does not run out of memory. The reason for splitting files recently has not been memory but timeouts, because of the number of lines to be read and operations to be performed. In fact, the reason this version was developed/split off, as identified in the forum and elsewhere, is that reading line by line avoids the memory issue.

There are other reasons that could cause a memory issue, but the question for this request is: what problem is actually occurring that needs to be resolved, and how can it be reproduced?

The problems seen so far with large files in my own testing have been timeouts, not memory. Buffering has not been a problem, so the change above does not appear to resolve a broad issue, but rather a localized one in which the source file ($file_location) is incorrectly or inconsistently generated. That doesn't mean it 1) can't be added or 2) won't be included, but I do want to understand the situation through answers to the questions in the paragraph above.

mc12345678 commented 8 years ago

Wanting to know whether this can be closed out or not. I still have not seen the impact of making the described change. Please respond with the information requested above.

There are other reasons that could cause a memory issue, but the question for this request is: what problem is actually occurring that needs to be resolved, and how can it be reproduced?

mc12345678 commented 3 years ago

I think that after 5 years of no response and no further reported occurrences, this can be closed. The software still reads one line at a time, and an improvement has since been added that attempts to extend or prevent the timeout; it may not work on all systems, and a host that ignores the request will still time out on a large file.

On another note, it does appear possible to hit a memory issue even when a timeout does not occur. In tests, each line read seems to cause some memory growth even after all generated data/variables have been unset. I have not found the source of that growth, but the total memory used changes with each additional line read, so after enough lines memory would eventually be exhausted. It may take a while, but..
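
A rough sketch of the behavior described above, using a generic line-by-line loop rather than EasyPopulate's actual import code; set_time_limit() is the kind of call some hosts ignore, and memory_get_usage() is one way to watch the per-line growth mentioned:

<?php
// Sketch only: extend the timeout on each iteration and log memory usage
// periodically to observe gradual per-line growth.
$handle = fopen($file_location, 'rb');
$line_no = 0;
while (($line = fgets($handle)) !== false) {
    $line_no++;
    // Grant another 30 seconds per line; hosts with a hard limit may ignore
    // this, in which case a sufficiently large file will still time out.
    set_time_limit(30);

    $fields = explode("\t", rtrim($line, "\r\n"));
    // ... process $fields here ...
    unset($fields, $line);

    // Even after unset(), total usage can creep upward with each line read.
    if ($line_no % 1000 === 0) {
        error_log('line ' . $line_no . ': ' . memory_get_usage(true) . ' bytes used');
    }
}
fclose($handle);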

Closing.