openlab-at-city-tech / webworkqa

WeBWorK integration for WordPress and BuddyPress
GNU General Public License v2.0

Images should be imported to WP at the time of problem import #39

Closed boonebgorges closed 6 years ago

boonebgorges commented 8 years ago

These images are specific to the user-problem combo (they may be generated with personalized values) so there could be lots of them.

Tasks to complete:

boonebgorges commented 7 years ago

I'll look at this for the spring, though I think it's somewhat lower priority than other potential enhancements.

boonebgorges commented 6 years ago

A first pass at this is now ready to test. Newly imported questions containing images (http://mathww.citytech.cuny.edu/webwork2/WW-Dev/CoordinatePlaneTrig/1/?effectiveUser=bgorges&key=K0tQJiYnMewkgRHIfagewmJokX5KmNSo&user=bgorges is an example) will trigger WP to import the image, and then swap out the URLs in the Problem Text HTML so that they point to the WP version of the image.
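As a rough illustration of the "swap out the URLs" step, here is a minimal sketch. It is not the plugin's actual implementation (that lives behind `fetch_external_assets()`); the function names `sketch_get_image_urls()` and `sketch_swap_image_urls()` and the example.org URLs are made up for this example, and the download itself is left out.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): collect the src of every
 * <img> found in a chunk of problem-text HTML, in document order.
 */
function sketch_get_image_urls( string $html ): array {
    // DOMDocument::loadHTML() rejects empty input, so bail out early.
    if ( '' === trim( $html ) ) {
        return array();
    }

    // Problem text is an HTML fragment, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors( true );
    $doc      = new DOMDocument();
    $doc->loadHTML( $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $urls = array();
    foreach ( $doc->getElementsByTagName( 'img' ) as $img ) {
        $src = $img->getAttribute( 'src' );
        if ( '' !== $src ) {
            $urls[] = $src;
        }
    }

    return array_values( array_unique( $urls ) );
}

/**
 * Hypothetical helper: replace each old image URL with its imported copy.
 * $url_map is array( old URL => new URL ); URLs without an entry are kept.
 */
function sketch_swap_image_urls( string $html, array $url_map ): string {
    return strtr( $html, $url_map );
}
```

On a real site, the values in `$url_map` would come from whatever step copies each image into the WP uploads directory (for example, something along the lines of `media_sideload_image()`). Note that this plain string swap assumes the URL appears in the HTML exactly as the parser reports it; a `src` containing encoded entities such as `&amp;` would need extra handling.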

I haven't yet done anything to import images for existing problems. This is next on my list.

boonebgorges commented 6 years ago

I've written a script that attempts to download images for old posts. See, e.g., http://openlabdev.org/webwork-playground/#:problemId=local/ShiftingParabolas/shift-down.pg:questionId=11501, which has now been moved over to webwork-playground.

The script is as follows:

<?php

global $wpdb;

$question_ids = $wpdb->get_col( "SELECT ID FROM {$wpdb->posts} WHERE post_type = 'webwork_question'" );
foreach ( $question_ids as $question_id ) {
    $question = new WeBWorK\Server\Question( $question_id );
    $old_pt = $question->get_problem_text();
    $question->fetch_external_assets();
    $new_pt = $question->get_problem_text();

    if ( trim( $old_pt ) === trim( $new_pt ) ) {
        WP_CLI::log( "No images fetched for $question_id" );
    } else {
        $question->save();
        WP_CLI::log( "Images swapped for $question_id" );
    }
}
boonebgorges commented 6 years ago

Regarding the third checkbox at the top ("Ensure that if the problem text used as the "primary" text...") - This was meant to address a concern (raised by Andrew or Jonas, I can't remember which) that a problem might have some questions whose WW images have been deleted and others whose WW images can still be imported. I have just run a script (on the production site data) to check this, and there are in fact no such instances. We can keep the script so that it can be rerun before the final import, but I don't think this kind of fallback will be necessary, since no problems meet these criteria.

<?php

global $wpdb;

$question_ids = $wpdb->get_col( "SELECT ID FROM {$wpdb->posts} WHERE post_type = 'webwork_question'" );
foreach ( $question_ids as $question_id ) {
    $question = new WeBWorK\Server\Question( $question_id );
    $pt = $question->get_problem_text();

    if ( empty( $pt ) ) {
        continue;
    }

    $has_external_images = bbg_has_external_images( $pt );
    if ( 'no_images' === $has_external_images || 'has_no_external_images' === $has_external_images ) {
        continue;
    }

    WP_CLI::log( "Question $question_id has external images" );
    $siblings = new \WeBWorK\Server\Question\Query( array(
        'problem_id' => $question->get_problem_id(),
    ) );

    $sibling_with_no_external_images = null;
    foreach ( $siblings as $sibling ) {
        $sibling_has_external_images = bbg_has_external_images( $sibling->get_problem_text() );

        if ( 'no_images' === $sibling_has_external_images || 'has_external_images' === $sibling_has_external_images ) {
            continue;
        }

        $sibling_with_no_external_images = $sibling->id;
    }

    if ( $sibling_with_no_external_images ) {
        WP_CLI::log( "$question_id has sibling $sibling_with_no_external_images with no external images" );
    }
}

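function bbg_has_external_images( $text ) {
    // loadHTML() rejects empty input, and empty text cannot contain images anyway.
    if ( '' === trim( (string) $text ) ) {
        return 'no_images';
    }

    // Problem text is an HTML fragment, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors( true );
    $d        = new \DOMDocument();
    $d->loadHTML( $text );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    // getElementsByTagName() always returns a DOMNodeList, so check its length.
    $imgs = $d->getElementsByTagName( 'img' );
    if ( 0 === $imgs->length ) {
        return 'no_images';
    }

    $home_domain = parse_url( home_url(), PHP_URL_HOST );
    $has_src     = false;
    foreach ( $imgs as $img ) {
        $src = $img->getAttribute( 'src' );
        if ( ! $src ) {
            // Skip <img> tags without a src instead of ignoring the remaining images.
            continue;
        }

        $has_src = true;

        // An image is external when its host differs from this site's host.
        if ( parse_url( $src, PHP_URL_HOST ) !== $home_domain ) {
            return 'has_external_images';
        }
    }

    return $has_src ? 'has_no_external_images' : 'no_images';
}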
boonebgorges commented 6 years ago

I believe we are ready for testing. Briefly:

  1. Find a WW question with an image. Use "ask for help", and then make sure that the question you create has problem text that contains an image stored on openlabdev.org and not the WW site.
  2. Look through old questions (on openlabdev.org) to make sure you don't see any images still stored on the WW site. http://openlabdev.org/webwork-playground/#:problemId=local/ShiftingParabolas/shift-down.pg:questionId=11501 is an example of an item that has been successfully imported.
boonebgorges commented 6 years ago

Thought more about my point 2 above and I think I'm incorrect - the concern is that future questions (which will have their images imported) will be posted to problems that have existing questions with unimported images. So this bit of logic still needs to be written. I'll work on it.

boonebgorges commented 6 years ago

I've implemented the fallback logic described in the third checkbox above. Briefly: If the "primary" question associated with a problem contains unimported images, then check to see whether another question associated with that problem has had its images imported successfully; if so, show the problem text of that question instead.
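To make the fallback rule easier to follow, here is a minimal sketch of the selection logic. It is not the code that was committed: the function names `sketch_image_hosts()` and `sketch_pick_problem_text()` are hypothetical, and treating "unimported" as "served from a host other than the WP site" is an assumption borrowed from the check script earlier in this thread.

```php
<?php
/**
 * Hypothetical helper: return one entry per <img> in the HTML, holding the
 * host of its src (null when the src is missing or relative).
 */
function sketch_image_hosts( string $html ): array {
    if ( '' === trim( $html ) ) {
        return array();
    }

    $previous = libxml_use_internal_errors( true );
    $doc      = new DOMDocument();
    $doc->loadHTML( $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $hosts = array();
    foreach ( $doc->getElementsByTagName( 'img' ) as $img ) {
        $host    = parse_url( $img->getAttribute( 'src' ), PHP_URL_HOST );
        $hosts[] = is_string( $host ) ? $host : null;
    }

    return $hosts;
}

/**
 * Hypothetical helper: pick the problem text to display.
 *
 * Keeps the primary text unless it has an image on a foreign host; in that
 * case, returns the first sibling text whose images are all local. Falls
 * back to the primary text when no sibling qualifies.
 */
function sketch_pick_problem_text( string $primary, array $sibling_texts, string $home_host ): string {
    $is_external = function ( $host ) use ( $home_host ) {
        return null !== $host && $host !== $home_host;
    };

    // Primary text has no unimported images: nothing to do.
    if ( ! array_filter( sketch_image_hosts( $primary ), $is_external ) ) {
        return $primary;
    }

    foreach ( $sibling_texts as $text ) {
        $hosts = sketch_image_hosts( $text );
        // A usable sibling has at least one image and none of them external.
        if ( $hosts && ! array_filter( $hosts, $is_external ) ) {
            return $text;
        }
    }

    return $primary;
}
```

In this sketch the caller passes the site's own host (on a WP site, the host portion of `home_url()`), which keeps the selection logic free of WordPress calls.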

In order to test this on webwork-playground, I had to manually modify some content. On the problem page http://openlabdev.org/webwork-playground/#:problemId=local/CoordinatePlaneTrig/six-trig-point-q1.pg, my own user (boonegorges) ought to see question 11643 as the "primary" text: http://openlabdev.org/webwork-playground/#:problemId=local/CoordinatePlaneTrig/six-trig-point-q1.pg:questionId=11643. But because the images there are broken, a different question's problem text is swapped in. See screenshot. I'm unsure whether this is enough for others to test - the whole thing is quite confusing :)

See also my suggested gloss text in the screenshot. Better ideas for text are welcome.

[Screenshot: screenshot_2018-01-02_14-32-49]

bree-z commented 6 years ago

Sorry, I'm still a little confused by this one! @moui72 perhaps you want to take a look? Otherwise, we might need more instructions.

boonebgorges commented 6 years ago

Hi @bree-z - Sorry, the last part of this one is very complicated, but the first part is not. The most important thing to test on this ticket is that when starting with a WW question that contains images, the problem text on the WP site (after the question has been saved) should contain a copy of the image that is served from the WP site.

bree-z commented 6 years ago

Ok, thanks @boonebgorges! That part is working for me.

E.g., I asked this question, and the image URL is: http://openlabdev.org/wp-content/uploads/2018/01/5f857248-3fdc-335c-b30b-a6e40fefe79a___7edb8295-3673-3e89-aca0-60a2aa2f6bd1.png

I got the same result for a few other questions with images.