xdamman / mediumexporter

Export your stories published on medium.com to markdown for easy import
MIT License
235 stars, 32 forks

wrong parsing | missing first image #29

Open rafapolo opened 5 years ago

rafapolo commented 5 years ago

I had to export hundreds of Medium posts for a project and noticed a consistent bug: the first image of a post is not captured by mediumexporter.

To reproduce:

$ mediumexporter https://medium.com/@AreYouSyrious/ays-news-digest-08-02-17-uk-reneges-on-the-dubs-agreement-destroys-hopes-of-refugee-children-97cc0ca96fe0 > output.md

Markdown output:

# AYS NEWS DIGEST 08.02.17 — UK reneges on the Dubs Agreement, destroys hopes of refugee children

People camp outdoors in Paris. Photo Credit: Calais Action

### Feature
...

The exporter misses the first image, but I'm not sure how to fix the parser.
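When exporting hundreds of posts, it may help to flag which output files are affected. The following is only a rough sketch: `hasLeadImage` is a hypothetical helper (not part of mediumexporter), and it assumes that an affected post should have had a Markdown image somewhere before its first sub-heading, which will not hold for every story.

```javascript
// Hypothetical helper, not part of mediumexporter.
// Returns true if a Markdown image appears before the first sub-heading
// (a line starting with "##" through "######"). The "# Title" line is
// not treated as a sub-heading.
function hasLeadImage(markdown) {
  for (const line of markdown.split('\n')) {
    if (/^#{2,6}\s/.test(line)) {
      return false; // reached the first sub-heading without seeing an image
    }
    if (/!\[[^\]]*\]\([^)]+\)/.test(line)) {
      return true; // found ![alt](url) before any sub-heading
    }
  }
  return false; // no image and no sub-heading at all
}

// Output shaped like the one reported above: caption present, image missing.
const reported = [
  '# AYS NEWS DIGEST 08.02.17',
  '',
  'People camp outdoors in Paris. Photo Credit: Calais Action',
  '',
  '### Feature',
].join('\n');

console.log(hasLeadImage(reported)); // false
```

Running a check like this over each exported file would give a list of candidates to inspect by hand; it cannot tell whether the original story really started with an image.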

evan-coygo commented 5 years ago

We're also having this issue.

rafapolo commented 5 years ago

@xdamman I guess it's related to parsing the metadata of the first paragraph. If you can point me in the right direction, I could try writing some Node.js to help ;)

xdamman commented 5 years ago

Sorry, I haven't touched this repo in more than a year. I've shifted my attention to the climate emergency 🌍🔥⏳ (if other open source developers want to join, please ping me!)

Any chance anyone else can take a look?

rafapolo commented 4 years ago

Documenting the refugee situation is also an emergency... I tried, but I'm not comfortable enough with Node.js or Medium's output format to fix it. The alternatives, medium-2-md and medium-to-markdown, are even more obsolete.

rafapolo commented 4 years ago

My guess is that the type of the first paragraph is not being parsed correctly: https://github.com/xdamman/mediumexporter/blob/master/lib/utils.js#L149. Line #39 is somehow being ignored: https://codebeautify.org/jsonviewer/cb88e380
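To make the guess above concrete, here is a minimal sketch of type-based paragraph rendering in which an image paragraph is handled the same way at index 0 as anywhere else. It is not the code in `lib/utils.js`. The numeric type codes, the `metadata.id` field, and the image URL prefix are assumptions about Medium's JSON payload and should be checked against a real story; `renderParagraph` and `renderParagraphs` are hypothetical names.

```javascript
// Assumed paragraph type codes; verify against a real payload.
const TYPE = { P: 1, H3: 3, IMG: 4 };

// Assumed image URL prefix; the real CDN path may differ.
const IMAGE_BASE = 'https://cdn-images-1.medium.com/max/800/';

// Render a single paragraph object to Markdown.
function renderParagraph(p) {
  const text = p.text || '';
  switch (p.type) {
    case TYPE.IMG: {
      // Without an image id there is nothing to link to; keep the caption.
      if (!p.metadata || !p.metadata.id) return text;
      const image = `![](${IMAGE_BASE}${p.metadata.id})`;
      // Emit the image first, then its caption (if any) as its own paragraph.
      return text ? `${image}\n\n${text}` : image;
    }
    case TYPE.H3:
      return `### ${text}`;
    default:
      return text; // plain paragraphs and any unhandled types
  }
}

// Render every paragraph, including the first one, with no special casing.
function renderParagraphs(paragraphs) {
  return paragraphs.map(renderParagraph).join('\n\n');
}

const sample = [
  { type: 4, text: 'Photo Credit: Calais Action', metadata: { id: '1*abc.jpeg' } },
  { type: 3, text: 'Feature' },
];
console.log(renderParagraphs(sample));
```

If the real payload matches these assumptions, comparing this output with what mediumexporter produces for the same first paragraph could show where the image is being dropped.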

tvvignesh commented 4 years ago

I am facing the same issue as well. Missing the first image, but all other images come up correctly.