Freeseer / freeseer

Designed for capturing presentations at conferences. Pre-fill a list of talks to record, record them, and upload them to YouTube with our YouTube Uploader.
http://freeseer.readthedocs.org
GNU General Public License v3.0
215 stars 110 forks

#555 - RSS feed parser not grabbing entire content

Open jameskunjoolee opened 10 years ago

jameskunjoolee commented 10 years ago

There are two instances in which the parser does not grab the entire content of a feed. In both, the content is cut off at the point where a list begins.

Instance 1
Expected:
"Design to Theme in Five"
You've designed this super amazing Web site in Photoshop (or Illustrator or GIMP or Inkscape or...) and then you hand it over to some programmer and now the Web site doesn't look anything like your design. BOO! HISS!
In this session you'll learn how to convert your own designs into your own awesome Drupal themes in five easy steps.\nWe'll cover the basics of how to:    
* Optimize your design files to make theming easier.
* Evaluate common base themes and know when to choose between several popular base themes (e.g. 960.gs, Zen).
* Create a new Drupal theme by extending a base theme.
* Develop common template files (tpl.php) necessary to theme pages and nodes using a text editor.
* Share your designs with others (licensing, uploading to drupal.org and selling your themes).
Whether you want to build and sell your own designs, or you're a newly hired designer at a Drupal Web development shop, this session will give you the confidence to transform your imagination into a working Web site.
Prerequisite: This is for Intermediate Drupal Users. For beginners or those evaluating Drupal for the first time, we highly recommend attending the Drupal KickStart Program on Monday, May 3rd.
Actual:
"Design to Theme in Five"
You've designed this super amazing Web site in Photoshop (or Illustrator or GIMP or Inkscape or...) and then you hand it over to some programmer and now the Web site doesn't look anything like your design. BOO! HISS!
In this session you'll learn how to convert your own designs into your own awesome Drupal themes in five easy steps.\nWe'll cover the basics of how to:    
Instance 2
Expected:
This talk discusses techniques for supercharging your Web 2.0 startup via open-source search technologies. 
The more data you have available today, the more you can confuse, befuddle and disturb your user.... unless you have a mature information search strategy.
In this talk, we discuss enterprise-ready Open-Source technologies available. These can help you to
    * Enhance your user's experience on your web site.
    * Easily find information in your enterprise repositories
    * Strengthen partnerships by making it easy for external partners to work with your information resources
    * Drive more traffic to your site via SEO-friendly search strategies
Content Covered:
    * The Search Stack
          - Getting Data: Web Crawlers + RSS/ATOM
          - Getting Data: Document Adapters
          - Information Extraction
          - Indexing
          - Querying
    * Different Search Scenarios
          - Web Search
          - Enterprise Search
          - Extranet Search
    * Crash Course: Deploying Search using SOLR
    * Advanced: SEO-friendly Search Strategies
Actual:
This talk discusses techniques for supercharging your Web 2.0 startup via open-source search technologies. 
The more data you have available today, the more you can confuse, befuddle and disturb your user.... unless you have a mature information search strategy.
In this talk, we discuss enterprise-ready Open-Source technologies available. These can help you to

Relevant: #506.

dideler commented 10 years ago

Nice find. Perhaps it would be better to have the test failing since it turns out that the parser has issues. Would you or @jh0720 be interested in further investigating the issue?

jameskunjoolee commented 10 years ago

The test now fails - presentation_feed.json now contains the expected result. After testing cli.py I can try looking into this issue.

dideler commented 10 years ago

@jameskunjoolee please do investigate, I've assigned you to the issue.

I recommend putting the CLI tests on hold and upping the priority on this. Tests let us discover issues like this, and when a test fails, the problem should be fixed as soon as possible (all tests should be passing).

dideler commented 10 years ago

@jameskunjoolee to respond to your question on IRC, the branch for this fix should be based off of upstream/master, not your rss testing branch. The fix is independent of the tests.

jameskunjoolee commented 10 years ago

Update: It looks like the offset here: https://github.com/Freeseer/freeseer/blob/master/src/freeseer/plugins/importer/rss_feedparser/__init__.py#L112 is part of the problem.
This offset works for content that is not separated by <br /> tags.
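
One way to avoid depending on a fixed offset would be to walk the HTML of the description and collect every text node, so that lists and <br /> tags no longer cut the content short. Below is a minimal sketch using only the standard library. It assumes the feed's description field holds an HTML fragment; `TextExtractor` and `extract_text` are hypothetical names for illustration, not part of the rss_feedparser plugin.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect all text in an HTML fragment, keeping list items and line breaks.

    Hypothetical helper for illustration; not Freeseer's actual implementation.
    """

    # Tags that should start (and end) a new line in the plain-text output.
    BLOCK_TAGS = {"p", "div", "ul", "ol", "li", "br"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")
        if tag == "li":
            # Mark list items the same way the expected output does.
            self.parts.append("* ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)

    def text(self):
        # Strip each line and drop the empty ones left behind by block tags.
        lines = (line.strip() for line in "".join(self.parts).splitlines())
        return "\n".join(line for line in lines if line)


def extract_text(html_fragment):
    """Return the full plain text of an HTML fragment, lists included."""
    parser = TextExtractor()
    parser.feed(html_fragment)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<p>We'll cover the basics of how to:</p>"
        "<ul><li>Optimize your design files.</li>"
        "<li>Create a new theme.</li></ul>"
        "Closing line.<br />Last line."
    )
    print(extract_text(sample))
```

Since this walks the whole fragment instead of slicing at a fixed position, text after a list or a <br /> tag is kept. Nested lists come out flat in this sketch, so the indentation seen in Instance 2 would need extra handling.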