richtabor / MerlinWP

Better WordPress Theme Onboarding
https://merlinwp.com
GNU General Public License v3.0

More content import issues on garbage hosting… #104

Open contempoinc opened 6 years ago

contempoinc commented 6 years ago

I'm getting more and more reports of buyers trying to import on crap shared hosting. Any word on when the image import on/off option will be implemented?

(cc @capuderg @richtabor)

capuderg commented 6 years ago

Hi,

would such an option really be the solution?

What kind of errors do they get? If you can, ask them for PHP error logs and see what kind of error is reported. Maybe we can fix that in the Merlin importer...

contempoinc commented 6 years ago

Timeout/admin ajax issues from the importer.

capuderg commented 6 years ago

I would need more specific PHP error logs (error messages), so we know what exactly caused the issue and we can try to solve it.

contempoinc commented 6 years ago

Next one I run into I’ll check their raw logs.

rubenbristian commented 6 years ago

Hi

I am also running into the same issues. I cannot get the import to complete, but I'm testing locally on a dev server. The PHP memory limit and execution times are all maxed out. The importer downloads a handful of media items, then it throws the "success" message. I cannot get it to run to the end. The PHP log is clean.

Since I cannot get it working locally, I assumed the import needs too many resources and that I would never get it working for a large XML file.

However, when I tested on WPEngine it worked, so I am not sure what to believe now.

richtabor commented 6 years ago

I have not had reports of a failed import, though my themes' data is fairly minimal.

rubenbristian commented 6 years ago

I've tested a really large import on a few hosting companies with their cheapest package. On some it went well (including a HostGator one for which I paid $0.01 :), on some it failed.

Here's a failed log. I see that there is a missing media error. Could that make the import fail? main.log

capuderg commented 6 years ago

Hi,

This log file shows the job running to the end: it has the "The final step has been displayed" message at the end of the log file, and the widgets and customizer settings get imported as well.

The error of the missing media is not causing it to fail, because it gets logged just OK:

How exactly does it fail? Do you get a red checkmark on content import?

Did you check PHP error logs on the server as well? After the content failed to import?

Take care!

rubenbristian commented 6 years ago

Hi

So, the import fails because it only loads about 30 media images and then triggers the success message. No pages, posts or custom post types get imported, because it stops at a certain point in the media upload and cannot go further, ignoring the rest of the XML import file.

If I did this via the regular WP importer, it would throw a 500 internal server error, showing that something is clearly wrong. When I use Merlin, it doesn't throw this error, and the importer acts as if it has completed, but it hasn't.

I've attached the complete log for the same XML file. You can see that it's much larger and that the pages and posts go through.

Of course, this is an edge case with a huge import, but the main issue is that Merlin reports that everything went well when it did not. If this could be improved, it would be awesome.

main (1).log

rubenbristian commented 6 years ago

And for more info, here's one Merlin log and one PHP log from the same host (DreamHost's cheapest package).

Image of Error

The timezone differs between the two logs, but you can see what happens at 01:52, when the importer stops importing media items and moves on to the widgets.

dreamhost-melin_log.txt dreamhost-php_log.txt

rubenbristian commented 6 years ago

Final reply, I promise :)

I've found the 500 error. It's thrown, and after that the wizard thinks that the import is done. Could the server response from the AJAX call be checked before showing the success message? And if an HTTP error code is returned, could an editable error message be shown instead?

I think that it's related to #100

screen shot 2018-05-09 at 10 13 51
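
One way to read the suggestion above, as a minimal sketch rather than Merlin's actual code: inspect the HTTP status and the response body before reporting success. The function name `classifyImportResponse`, the messages, and the assumption that the server replies with JSON carrying a `success` flag are all hypothetical.

```javascript
// Hypothetical helper: decide whether an import AJAX response really means success.
// `status` is the HTTP status code, `rawBody` is the response text.
function classifyImportResponse(status, rawBody) {
  // Any non-2xx status (for example the 500 seen in this thread) is a failure.
  if (status < 200 || status >= 300) {
    return { ok: false, message: `The import request failed with HTTP ${status}.` };
  }

  let data;
  try {
    data = JSON.parse(rawBody);
  } catch (err) {
    // A 200 with an HTML error page or truncated output is not a success either.
    return { ok: false, message: "The server returned an unreadable response." };
  }

  // Assumption: the server marks a good reply with { "success": true }.
  if (data === null || typeof data !== "object" || data.success !== true) {
    return { ok: false, message: "The server did not confirm the import." };
  }

  return { ok: true, message: "Import step completed." };
}
```

With a check like this, the wizard would only show its success state when `ok` is true, and could show the (theme-editable) `message` otherwise.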

contempoinc commented 6 years ago

This is the same error I was getting due to cheap server environment timeouts. What's funny is that when using the WordPress V2 Importer on its own with the XML, the batch processor doesn't have any issues pushing large amounts of images and content through. I thought Merlin was using the Importer V2?

capuderg commented 6 years ago

Merlin uses the backend code of the v2 importer, which is optimized for memory usage, but it does not use the frontend. The v2 importer uses another technology for making requests to the backend, which is not compatible with all browsers, so if we used that, even more users would have problems with it. That's why we use the older AJAX approach, to make it as compatible as possible.

I see that the Merlin importer is ignoring the errors and going forward with the import, which has to be resolved. However, we still don't know why the 500 errors are thrown. It would be nice to know why and to catch these errors if possible.

Any help (PRs) on fixing https://github.com/richtabor/MerlinWP/issues/100 would be greatly appreciated.

rubenbristian commented 6 years ago

500 errors are thrown even with the XML importer. How do you catch them? I'll try to look into #100 today.

capuderg commented 6 years ago

@rubenbristian great!

As for the 500 errors: if it's an error that we can prevent, then we should do that. But we first need to know what is causing it. I'm not convinced that it's a timeout issue, since a new AJAX call is made every 20 seconds, and almost all servers have a timeout of at least 30 seconds...

rubenbristian commented 6 years ago

How do you find out what's causing the 500 error then? I don't know how to test for it. Would it help if I gave you access to a hosting account that throws that error?

capuderg commented 6 years ago

That's the problem. I don't know what's causing it. If a PHP error is causing it, then there should be an indication of that error in the PHP error logs.

The other thing that could be done is to debug the issue and try to find where it breaks.

Now that I'm thinking about it, it is possible for the importer to time out. Let's say that an image starts importing in the 19th second of the current AJAX call. This image is very big and takes a long time to import. With the added "slowness" of the server it might take more than 10 seconds, which would bring the total time of the AJAX call to more than 30 seconds and cause the server to time out.

However, this was not the case in the above example from @rubenbristian, since the failed AJAX call ran for only 6 seconds. So something else must be going on. Maybe it's a use case that is not covered in the importer code. Maybe it's something about the server/PHP configuration. This is what we have to find out.
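
The arithmetic behind that timeout scenario can be sketched as a small simulation. It assumes, hypothetically, that the importer only checks its 20-second budget between items and that the server limit is 30 seconds; `simulateRequest` and the item durations are made up for illustration.

```javascript
// Simulate one AJAX call that imports items until a time budget is used up.
// The budget is only checked BETWEEN items, so one slow item can overrun it.
// `elapsed` is the time the call would need, not the time the server allows.
function simulateRequest(itemSeconds, budgetSeconds, serverLimitSeconds) {
  let elapsed = 0;
  let processed = 0;
  for (const seconds of itemSeconds) {
    if (elapsed >= budgetSeconds) {
      break; // budget spent: the remaining items wait for the next AJAX call
    }
    elapsed += seconds;
    processed += 1;
  }
  return { processed, elapsed, timedOut: elapsed > serverLimitSeconds };
}

// 19 s of small items, then one big image that needs 12 s on a slow server.
const result = simulateRequest([19, 12, 5], 20, 30);
console.log(result); // { processed: 2, elapsed: 31, timedOut: true }
```

The big image starts at second 19, which is still inside the 20-second budget, and pushes the call to 31 seconds. A call that needs only 6 seconds, like the failed one in the logs above, stays well under both limits in this model.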

rubenbristian commented 6 years ago

The issue is the same. Timeout errors cause a 500 on some servers; I just tested this. I'm trying to import a single large video (70 MB) on localhost, and after 30 seconds (the max execution time) it fails, gives a 500 error, and the importer triggers the success message.

So it's the exact same thing as #100

Working on the PR now.

contempoinc commented 6 years ago

Has development stopped, or?

capuderg commented 6 years ago

We've been busy with other things, so this project was put on the sidelines. We'll get back to it in the coming weeks.

redlagoon commented 6 years ago

Just trying to make sure before I say anything concrete -

As far as I see, Merlin downloads the file before it does anything, so the WXR Importer doesn't have to wait for the Downloader to finish, correct?

If so, is that 500 issue coming from the server not being able to handle the big image and file imports it has to do when the user imports the actual .xml file with the option to also import images?

So the issue lies with the server being just too damn slow to process these big files in a reasonable time?

I'll suggest an intermediary fix, which I believe gives the best of both worlds: create your content file, but on your demo (that you're exporting from), use optimized one-color JPEG images. I do understand that some demos might look bad, but it's the content that matters most to people.

Hopefully in a few days I can push some files here that give users an option: import the images in their full glory, or just go with the lite version.

contempoinc commented 6 years ago

@redlagoon That about hits the nail on the head from what I've seen on many different buyers' cheap shared environments.

redlagoon commented 6 years ago

@contempoinc Alrighty, then it means that, even if the WXR Importer was bad and queried a lot, small images should generally fix the issue.

The thing with Merlin itself is that it automatically fetches the attachments.

$this->importer = new ProteusThemes\WPContentImporter2\Importer( array( 'fetch_attachments' => true ), $logger );

You can leave it as it is, but please try this. Force your demo images to default to an optimized, small-sized image. Create an image that's one color, grayscale, with the quality settings on low.

Then use http://optimizilla.com/ or any other tool to optimize it as well as you can.

Finally, ship the XML file using that.

Additionally (but it requires more work), you can ship a 1x1 JPG and set the background to repeat itself, then override this behavior with your images when users actually upload an image.


Additionally, is this all worth it? In my experience, from quite a lot of sales and interaction with people, these issues happen only once every few dozen times and, at worst, people would just have to create their own demo content (let's face it, demo imports don't last and are mostly a gimmick; no one ends up using them or even building upon them, at least in my experience). Does it happen more often to you?

contempoinc commented 6 years ago

@redlagoon I've already done all the optimization I can, as well as moving everything to a fast server. I only get a few of these requests per week; then I recommend moving to a better host (referral money), and if they don't want to or their "client" can't, I just tell them the demo imports are optional, etc…

However, it still would be nice to have the importer really break the images up into chunks like the WordPress Importer V2 plugin does. I know Merlin is using a fork of said import class, but when I use that plugin on these crap hosts as a last-ditch effort, it actually works with the same XML files.

redlagoon commented 6 years ago

So it's simply a Merlin issue, you're saying?

1) Any chance you can put up these problematic XML files?

2) Can you please try to use OCDI on the same XML file?

capuderg commented 6 years ago

The OCDI and Merlin use the same forked WP importer v2 (from humanmade), which is maintained by us, so it should produce the same results.

At that time, the v2 Importer was in active development and it looked like it would replace the original WP importer some day, but a few months later development stopped.

We forked the project and stripped away all the UI stuff, so we ended up with just the core of the importer, which we used for the OCDI plugin and later for Merlin. We fixed a few issues that were reported, and we maintain it.

The main task in solving the big file imports is to check how the original WP importer handles this, since you said that importing with it always works.

I just don't know how a server times out with Merlin or OCDI, which use AJAX calls to make the requests shorter and easier for the server to digest, while the original WP importer doesn't time out even though it imports everything in one step.

In theory this doesn't make any sense, so maybe something else is causing it to fail...

redlagoon commented 6 years ago

Got it. I'm currently testing Merlin's Importer vs what WordPress on 4.9.6 uses and also OCDI's to see the differences.

I'll post results as soon as I can, but really busy nowadays.

contempoinc commented 6 years ago

@capuderg The current/old WordPress Importer doesn't work with large XML files.

The only solution I currently have for my buyers on cheap, slow shared hosting is to install the WordPress Importer V2 plugin (as it sits on GitHub right now), manually upload the XML they'd like, and let it run and do its thing. 99.999% of the time it works perfectly: images, pages, posts, terms, etc. are all imported.

Which is why it's weird: if Merlin is using the exact same code base as that plugin, it shouldn't be timing out then?

capuderg commented 6 years ago

Our forked v2 importer used in OCDI and Merlin is not exactly the same. We stripped it down to only the import core (removed the UI and other things), so we can break the import into smaller chunks with multiple AJAX calls, since those are supported by all browsers. The v2 importer, on the other hand, uses EventSource to open a connection to the server and run the import. Maybe that's the thing that solves the big file imports. However, this technology is not supported by all browsers (IE and Edge do not support it).

We used AJAX just because it's supported in all browsers, so all users can use OCDI or Merlin.
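
A rough sketch of the chunked-AJAX approach described above, not the actual Merlin or OCDI code: the client keeps sending requests until the server reports that it is finished, and stops as soon as a request fails. `runChunkedImport`, `requestChunk`, and the `"continue"`/`"done"` states are hypothetical names.

```javascript
// Hypothetical driver: keep requesting import chunks until the server says it is done.
// `requestChunk` stands in for one AJAX call and must resolve to
// { httpStatus, state }, where state is "continue" or "done" (assumed names).
async function runChunkedImport(requestChunk, maxRequests = 1000) {
  for (let attempt = 1; attempt <= maxRequests; attempt++) {
    const reply = await requestChunk(attempt);
    if (reply.httpStatus !== 200) {
      // Stop on the first failed request instead of treating it as completion.
      return { finished: false, requests: attempt, error: `HTTP ${reply.httpStatus}` };
    }
    if (reply.state === "done") {
      return { finished: true, requests: attempt, error: null };
    }
  }
  return { finished: false, requests: maxRequests, error: "too many requests" };
}

// Demo with a fake server that needs three requests to finish.
const fakeServer = async (attempt) => ({
  httpStatus: 200,
  state: attempt < 3 ? "continue" : "done",
});
runChunkedImport(fakeServer).then((outcome) => console.log(outcome));
// logs { finished: true, requests: 3, error: null }
```

Each request stays short, which is the point of chunking, and a 500 ends the loop with an error rather than a success message.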