mcs07 / ChemDataExtractor

Automatically extract chemical information from scientific documents
http://chemdataextractor.org
MIT License
287 stars 112 forks

Title fix #21

Open JeffersonH44 opened 6 years ago

JeffersonH44 commented 6 years ago

I found an error with this example:

--- example begin

3.2. Experimental Details

3.2.1. Synthesis of Phosphorus Ylide 5

N-Benzyl-2-chloroacetamide (2): Chloroacetamide 2 was prepared following the procedure described in the literature [23]. To a stirred solution of benzylamine (7.8 mL, 70.8 mmol) in toluene (60 mL) under cooling with ice bath, chloroacetyl chloride (4 g, 35.4 mmol) was slowly added. The reaction mixture was stirred vigorously for 1h at room temperature. The solvent was evaporated under vacuum, the crude reaction was dissolved in dichloromethane (100 mL) and washed with water (3 × 50 mL). The organic layer was dried over anhydrous MgSO4, filtered and the solvent evaporated under vacuum. The product was obtained as a white solid (6.30 g, 97%). m.p. 91–92 °C (93–96 °C from literature) [23]; 1H-NMR (CDCl3) δ 4.11 (s, 2H), 4.50 (d, 2H, J = 6.0 Hz), 6.89 (br s, 1H), 7.26–7.36 (m, 5H, Ar-H).

1-Benzyl-5-(chloromethyl)-1H-tetrazole (3): Compound 3 was prepared by an analogous method to that described in the literature [24]. PCl5 (7.06 g, 33.9 mmol) was added slowly to a solution of N-benzyl-2-chloroacetamide (5.66 g, 30.8 mmol) in toluene (50 mL) under cooling with ice-water bath. The mixture was stirred at room temperature for 2 h, then NaN3 (3.01 g, 46.3 mmol) was added. The reaction mixture was stirred at room temperature for 30 min, water (0.8 mL) was added dropwise and the whole was refluxed for 5 h. After cooling, the reaction mixture was poured into water and extracted with chloroform. The combined organic layers were washed successively with water, NaOH solution 1M and saturated NaCl solution and dried over anhydrous MgSO4. After removal of the solvent, the crude product was purified by flash chromatography (ethyl acetate/hexane (1:2)) affording the tetrazole 3 as light yellow solid (3.47 g, 54%). m.p. 57–59 °C (from diethyl ether) (62–63 °C from literature) [24]; 1H-NMR (CDCl3) δ (ppm) 4.62 (s, 2H), 5.68 (s, 2H), 7.28–7.30 (m, 2H, Ar-H), 7.39–7.40 (m, 3H, Ar-H).

--- example end

The problem was that the title took priority over the name found in the paragraph. This made the extractor assign the first NMR spectrum to Phosphorus Ylide 5 instead of N-Benzyl-2-chloroacetamide (2). My fix gives priority to last_id_record if it contains any name, which basically solves the issue. In addition, the way head_def_record_i is managed was causing conflicts in the assignment of the NMR spectra: in this case, the last NMR spectrum detected was assigned to N-Benzyl-2-chloroacetamide (2) instead of 1-Benzyl-5-(chloromethyl)-1H-tetrazole (3).
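As a rough illustration of the priority rule only, here is a simplified sketch. It is not the actual code in document.py: the function name pick_owner and the use of plain dicts with a "names" key are hypothetical stand-ins for the real record objects.

```python
def pick_owner(head_def_record, last_id_record):
    """Choose which record a contextual property (e.g. an NMR spectrum) attaches to.

    Hypothetical helper: records are modelled as plain dicts with a "names" list.
    """
    # Prefer the most recent identifying record from the paragraph
    # whenever it actually carries a compound name.
    if last_id_record is not None and last_id_record.get("names"):
        return last_id_record
    # Otherwise fall back to the compound defined in the heading.
    return head_def_record


title_record = {"names": ["Phosphorus Ylide 5"]}
paragraph_record = {"names": ["N-Benzyl-2-chloroacetamide"], "labels": ["2"]}

owner = pick_owner(title_record, paragraph_record)
print(owner["names"][0])
```

With the example above, the paragraph-level compound is chosen over the heading compound; the heading is used only when no named record precedes the spectrum.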

PS: To check for None, it is better to use the explicit comparison "my_variable is not None" than to use my_variable on its own as the condition of an if statement. I mention this because head_def_record_i in document.py is expected to be a number or None. If head_def_record_i is 0, a bare truthiness check treats it as False, which could lead to unexpected behavior.
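A minimal, standalone demonstration of the difference follows. The two function names are made up for illustration; only the variable name head_def_record_i comes from document.py.

```python
def check_truthy(head_def_record_i):
    # Bare truthiness test: index 0 is falsy, so it is wrongly skipped.
    if head_def_record_i:
        return "used"
    return "skipped"


def check_explicit(head_def_record_i):
    # Explicit None test: index 0 is a valid value and is handled.
    if head_def_record_i is not None:
        return "used"
    return "skipped"


for value in (None, 0, 3):
    print(value, check_truthy(value), check_explicit(value))
```

The two checks agree for None and for non-zero indices, and differ only when the index is 0.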

JeffersonH44 commented 6 years ago

@mcs07