jekyll / jekyll-import

:inbox_tray: The "jekyll import" command for importing from various blogs to Jekyll format.
https://import.jekyllrb.com
MIT License

Error when importing from drupal 7 #334

Closed ccamara closed 6 years ago

ccamara commented 6 years ago

Disclaimer: I am new to Jekyll (I've used it on a couple of simple sites) and to Ruby.

I want to migrate from a Drupal 7 site and haven't been able to, even though I think I've followed the instructions.

This is what I've done so far:

  1. Create a blank Jekyll site (jekyll new myblog).
  2. Change into the myblog directory.
  3. Run the following code:

$ ruby -rubygems -e 'require "jekyll-import"; JekyllImport::Importers::Drupal7.run({ "dbname" => "name", "user" => "myuser", "password" => "mypassword", "host" => "myhost", "prefix" => "mytableprefix", "types" => ["blog", "post"] })'

(Obviously, I substituted my own database name, user, password, host, prefix and types.)

And this is what I get:

Configuration file: /www/jekyll-test/myblog/_config.yml
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:90:in `block (2 levels) in process': undefined method `force_encoding' for []:Array (NoMethodError)
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:89:in `each_pair'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:89:in `block in process'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/dataset/actions.rb:151:in `block in each'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:238:in `block (2 levels) in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:238:in `each'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:238:in `block in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:152:in `_execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `block in execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/database/connecting.rb:264:in `block in synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/connection_pool/threaded.rb:91:in `hold'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/database/connecting.rb:264:in `synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/dataset/actions.rb:1085:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:273:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/adapters/mysql2.rb:236:in `fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.3.0/lib/sequel/dataset/actions.rb:151:in `each'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:80:in `process'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importer.rb:23:in `run'
    from -e:2:in `<main>'



Unfortunately, I cannot understand the error. What am I doing wrong?
rickysarraf commented 6 years ago

I can confirm the same issue. After upgrading to 0.13.0, #319 got fixed. But now I've run into this error.

rickysarraf commented 6 years ago

I think there's something wrong with the construct of:

# Get the relevant fields as a hash and delete empty fields
data = data.delete_if { |_k, v| v.nil? || v == "" }.each_pair do |_k, v|
  (v.is_a? String ? v.force_encoding("UTF-8") : v)
end

Because when I run the individual pieces by hand, they work as intended.

irb(main):010:0> s = "Ruby"
=> "Ruby"
irb(main):011:0> s.is_a? String
=> true
irb(main):012:0> s.force_encoding("UTF-8")
=> "Ruby"
irb(main):013:0> s
=> "Ruby"
irb(main):014:0> 
rickysarraf commented 6 years ago

Here's where the problem lies:

ruby 2.3.6p384 (2017-12-14) [x86_64-linux-gnu]
/var/lib/gems/2.3.0/gems/kramdown-1.14.0/lib/kramdown/compatibility.rb:43: warning: method redefined; discarding old <=>
/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55: warning: loading in progress, circular require considered harmful - /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import.rb
    from -e:1:in  `<main>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:39:in  `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in  `rescue in require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in  `require'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import.rb:6:in  `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in  `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in  `require'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll/commands/import.rb:4:in  `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in  `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in  `require'
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/easyblog.rb:70: warning: assigned but unused variable - category
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/easyblog.rb:71: warning: assigned but unused variable - tags
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/easyblog.rb:36: warning: assigned but unused variable - section
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/marley.rb:59: warning: shadowing outer local variable - f
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/tumblr.rb:283: warning: mismatched indentations at 'end' with 'def' at 260
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/tumblr.rb:299: warning: mismatched indentations at 'end' with 'def' at 285
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/wordpress.rb:166: warning: shadowing outer local variable - status
/var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/query.rb:84: warning: statement not reached
/var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/model/associations.rb:2546: warning: assigned but unused variable - res
/var/lib/gems/2.3.0/gems/safe_yaml-1.0.4/lib/safe_yaml.rb:28: warning: method redefined; discarding old safe_load
/var/lib/gems/2.3.0/gems/psych-2.2.4/lib/psych.rb:290: warning: previous definition of safe_load was here
/var/lib/gems/2.3.0/gems/safe_yaml-1.0.4/lib/safe_yaml.rb:52: warning: method redefined; discarding old load_file
/var/lib/gems/2.3.0/gems/psych-2.2.4/lib/psych.rb:471: warning: previous definition of load_file was here
Configuration file: none
{"excerpt"=>"", "categories"=>[], "layout"=>"blog", "title"=>"RESEARCHUT \u00BB RESEARCHUT", "created"=>1295721846}
v is []
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:92:in `block (2 levels) in process': undefined method `force_encoding' for []:Array (NoMethodError)
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:90:in `each_pair'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:90:in `block in process'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:151:in `block in each'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `block (2 levels) in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `each'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `block in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:152:in `_execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `block in execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/database/connecting.rb:264:in `block in synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/connection_pool/threaded.rb:91:in `hold'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/database/connecting.rb:264:in `synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:1081:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:273:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:236:in `fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:151:in `each'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:80:in `process'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importer.rb:23:in `run'
    from -e:2:in `<main>'
root@sid-container:~# 

As I understand it, v is assumed not to be null. But in my case there are blog pages without any categories associated, which results in v being [].

{"excerpt"=>"", "categories"=>[], "layout"=>"blog", "title"=>"RESEARCHUT \u00BB RESEARCHUT", "created"=>1295721846}
v is []
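
To see why the empty categories list survives the cleanup step, here is a minimal sketch. It uses a copy of the row above and `reject` with the same predicate as the importer's `delete_if`; the `stricter` variant is a hypothetical alternative for illustration, not something the importer does.

```ruby
# A copy of the row printed above.
row = {
  "excerpt"    => "",
  "categories" => [],
  "layout"     => "blog",
  "title"      => "RESEARCHUT \u00BB RESEARCHUT",
  "created"    => 1295721846,
}

# Same predicate as the importer: only nil and "" are dropped.
# [] == "" is false, so the empty categories array stays in the hash.
filtered = row.reject { |_k, v| v.nil? || v == "" }

# Hypothetical stricter predicate (an assumption, not the importer's code):
# also drop anything that responds to empty? and is empty, such as [].
stricter = row.reject { |_k, v| v.nil? || (v.respond_to?(:empty?) && v.empty?) }

puts filtered.keys.inspect
puts stricter.keys.inspect
```

Dropping empty arrays would only hide this particular row. Any other non-String value that reaches force_encoding would still fail unless the type check in front of it actually works.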
rickysarraf commented 6 years ago

Okay, I figured out a workaround for this issue. The condition was failing for the unpublished Drupal entry that the error referred to. Perhaps its categories resolved as empty because it was unpublished, since the draft copy did have a 'General' category assigned to it.

rickysarraf commented 6 years ago

Now it progresses further, but I get a new error.

Configuration file: none
{"excerpt"=>" <p>It was <a href=\"http://www.researchut.com/blog/archive/2010/06/06/fuck-you-sony\">unfortunate</a> when Sony decided to pull out the Other OS support from PS3. One of the reasons of convincing myself to buy it was this feature. With that feature gone, the PS3 stood as nothing much but mostly a media center and an occasional game box.</p>", "categories"=>["rhut"], "layout"=>"blog", "title"=>"One week with the move", "created"=>1291409100}
/var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:92:in `is_a?': class or module required (TypeError)
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:92:in `block (2 levels) in process'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:91:in `each_pair'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:91:in `block in process'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:151:in `block in each'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `block (2 levels) in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `each'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:238:in `block in fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:152:in `_execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `block in execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/database/connecting.rb:264:in `block in synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/connection_pool/threaded.rb:91:in `hold'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/database/connecting.rb:264:in `synchronize'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/utils/mysql_mysql2.rb:38:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:1081:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:273:in `execute'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/adapters/mysql2.rb:236:in `fetch_rows'
    from /var/lib/gems/2.3.0/gems/sequel-5.0.0/lib/sequel/dataset/actions.rb:151:in `each'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importers/drupal_common.rb:80:in `process'
    from /var/lib/gems/2.3.0/gems/jekyll-import-0.13.0/lib/jekyll-import/importer.rb:23:in `run'
    from -e:2:in `<main>'

This error looks more like a syntax problem.

irb(main):001:0> s = "Ritesh"
=> "Ritesh"
irb(main):003:0> s.is_a? String
=> true
irb(main):004:0> s.force_encoding("UTF-8")
=> "Ritesh"
irb(main):005:0> s.is_a? String ? s.force_encoding("UTF-8") : s
TypeError: class or module required
    from (irb):5:in `is_a?'
    from (irb):5
    from /usr/bin/irb:11:in `<main>'
irb(main):006:0> s.is_a? String 
=> true
irb(main):007:0> s.force_encoding("UTF-8")
=> "Ritesh"
irb(main):008:0> s
=> "Ritesh"

I think the module is now hitting this very same error:

irb(main):005:0> s.is_a? String ? s.force_encoding("UTF-8") : s
TypeError: class or module required
    from (irb):5:in `is_a?'
    from (irb):5
    from /usr/bin/irb:11:in `<main>'
rickysarraf commented 6 years ago

I am out of ideas as to why the above fails when run inside the module, given that the value is a string.

rrs@priyasi:~$ irb
irb(main):001:0> 
irb(main):002:0* 
irb(main):003:0* s = " <p>It was <a href=\"http://www.researchut.com/blog/archive/2010/06/06/fuck-you-sony\">unfortunate</a> when Sony decided to pull out the Other OS support from PS3. One of the reasons of convincing myself to buy it was this feature. With that feature gone, the PS3 stood as nothing much but mostly a media center and an occasional game box.</p>"
=> " <p>It was <a href=\"http://www.researchut.com/blog/archive/2010/06/06/fuck-you-sony\">unfortunate</a> when Sony decided to pull out the Other OS support from PS3. One of the reasons of convincing myself to buy it was this feature. With that feature gone, the PS3 stood as nothing much but mostly a media center and an occasional game box.</p>"
irb(main):004:0> s
=> " <p>It was <a href=\"http://www.researchut.com/blog/archive/2010/06/06/fuck-you-sony\">unfortunate</a> when Sony decided to pull out the Other OS support from PS3. One of the reasons of convincing myself to buy it was this feature. With that feature gone, the PS3 stood as nothing much but mostly a media center and an occasional game box.</p>"
irb(main):005:0> s.is_a? String
=> true
irb(main):006:0> s.force_encoding("UTF-8")
=> " <p>It was <a href=\"http://www.researchut.com/blog/archive/2010/06/06/fuck-you-sony\">unfortunate</a> when Sony decided to pull out the Other OS support from PS3. One of the reasons of convincing myself to buy it was this feature. With that feature gone, the PS3 stood as nothing much but mostly a media center and an occasional game box.</p>"
irb(main):007:0> 
rickysarraf commented 6 years ago

Okay! I finally found the bug. As suspected, the issue lies with the syntax of (v.is_a? String ? v.force_encoding("UTF-8") : v), which needs to be changed to (v.is_a?(String) ? v.force_encoding("UTF-8") : v). Without the parentheses, Ruby passes the whole ternary expression to is_a? as its argument.

Kudos to: https://stackoverflow.com/questions/23372692/ruby-is-a-class-or-module-required-typeerror
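
For anyone landing here later, a small sketch of what the parser does with the two spellings. The method names `buggy`, `fixed` and `error_class` are made up for illustration; only the two expressions inside them come from drupal_common.rb.

```ruby
# Unparenthesized form, as in jekyll-import 0.13.0. Ruby reads it as
#   v.is_a?(String ? v.force_encoding("UTF-8") : v)
# so the whole ternary becomes the argument to is_a?.
def buggy(v)
  v.is_a? String ? v.force_encoding("UTF-8") : v
end

# Parenthesized form: is_a? gets String, and its result drives the ternary.
def fixed(v)
  v.is_a?(String) ? v.force_encoding("UTF-8") : v
end

# Illustrative helper: returns the class of the exception the block raises,
# or nil if it raises nothing.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

array_error  = error_class { buggy([]) }          # force_encoding called on an Array
string_error = error_class { buggy("Ruby".dup) }  # is_a? called with a String argument

fixed_array  = fixed([])
fixed_string = fixed("Ruby".dup)

puts array_error.inspect
puts string_error.inspect
puts fixed_array.inspect
puts fixed_string.encoding
```

Because `String` is a constant and therefore truthy, the unparenthesized version always calls force_encoding on v. That accounts for both errors in this thread: arrays have no such method, and for strings the returned string is then handed to is_a? as if it were a class.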

DirtyF commented 6 years ago

@rickysarraf care to submit a PR so that we can release a patch version?

https://github.com/jekyll/jekyll-import/blob/9f3b744cea999202ffe3a5149384e0cab7caae6c/lib/jekyll-import/importers/drupal_common.rb#L90

rickysarraf commented 6 years ago

This bug should be closed given that the relevant PRs have been merged.